THE INFLUENCE OF TEACHERS ON ASPIRATIONS
OF STUDENTS¹


HOWARD ROSENFELD AND ALVIN ZANDER
University of Michigan 


Early studies of goal setting have
shown that a person usually selects a 
level of aspiration which represents a 
mild challenge for him, given that he 
has knowledge of his past performances 
(Lewin, Dembo, Festinger, & Sears, 
1944). In general, he attempts to reach 
the most rewarding goal that he feels 
he can reasonably attain. It is also 
known that aspirations are often in- 
fluenced by sources external to the 
persons who set them (Festinger, 
1942). Little evidence however is avail- 
able concerning the nature of this in- 
fluence process. 

The interest of this study in the 
effects of teachers on a student’s as- 
pirations was stimulated by the above 
problem and by the awareness that 
appropriate goal setting by a student 
is an important practical issue. A 
salient dimension on which a pupil 
can hardly avoid setting a level of as- 
piration, and can hardly avoid being 
influenced by his teacher when doing 
so, concerns his grades in school. We 
define a student’s level of aspiration as 
that level of achievement, indicated 
by a grade, which a pupil realistically 
expects to attain in a given course. 

For the purposes of this investiga-
tion, we assume that teachers attempt
to influence pupils to work up to their
capacities, and that these pressures
are guided by professional norms that
students should not be expected to
work beyond or below their capacities.
We assume that any given student is
reasonably confident he knows the
level of his best possible performance
(his capacity) and that he believes his
teacher knows his capacity equally
well. We further assume that students
are aware of the pressures from
teachers and that students believe
they should not be pressed to aspire
either beyond or below their capacities.

1 The research reported herein was per-
formed pursuant to a contract with the
Office of Education, United States Depart-
ment of Health, Education, and Welfare.

The attempts of teachers to influence 
students are usually based upon as- 
sumptions as to what will motivate 
pupils to accept these inductions, such 
as rewards, punishments, provision of 
relevant information, and so on. 
French and Raven (1959) have pro- 
posed five separate bases of social 
power whose effectiveness depends 
upon the degree that they stimulate 
forces in the recipient of the influence 
attempts to act in accord with these 
inductions, minus the degree they 
generate forces in the recipient to re- 
sist these inductions. A summary of 
studies on the consequences of these 
different forms of social power is pro- 
vided by French and Raven (1959) and 
by Cartwright (1959). The hypotheses 
considered in the present investigation 
are in large part suggested by French 
and Raven. The derivations of these 
hypotheses are described by those 
authors and will not be discussed here. 

A secondary concern of this study is
in the student’s perceptions of the
valence and the probability of per-
forming up to his capacity, and how
these variables intervene between in- 
fluence attempts by teachers and the 
setting of aspirations by students. This 
interest in the consequences of per- 
ceived valence and probability stems 
from assumptions in Lewin’s theory of 
aspiration setting that the level of 
aspiration is a function of the positive 
valence of succeeding, the negative 
valence of failing, and the probability 
of succeeding or failing. Probability is 
measured here in terms of the per- 
ceived difficulty of achieving a capac- 
ity performance. 

Finally, we consider the effects of dif- 
ferent forms of social power upon 
attitudes toward teachers and the sub- 
ject matter of the course. Hypotheses 
tested here were, for the most part, 
proposed by French and Raven. 

The present theoretical orientation 
may be summarized in the following 
model, the terms of which are ex- 
plained below. 


Act of teacher     -->  Decision making by student  -->  Decision of student

Form of influence  -->  Valence of capacity         -->  Congruence between
                        Difficulty of capacity           student’s aspirations
                                                         and capacity


By form of influence we refer to the
bases of social power proposed by
French and Raven: reward, coercion,
expert, legitimate, and referent. Each
of these is defined in the presentation
of the results. Valence of capacity is
the degree that a student perceives the
attainment of his capacity grade as an
attractive goal. Difficulty refers to his
perception of the probability that the
goal is attainable. It is assumed that
the degree of difficulty perceived by a
student is determined not only by his
personal ability, but also by external
barriers such as the competence of
teachers and the sufficiency of time and
help in doing assignments. Degree of
congruence between aspired grade and
capacity grade is the distance between
the aspired grade and the capacity
grade, divided by 1.


METHOD


Due to the exploratory nature of this re- 
search and the large number of concepts in- 
volved, a written questionnaire was selected 
as the basic instrument. Although it is diffi- 
cult to specify with confidence the direction 
of causality in our correlational results, hy- 
potheses and empirical evidence developed 
in other settings can be used in interpreting 
the most probable direction of causality. 
Conclusions will be stated in terms of hy- 
potheses deemed worthy of study under 
more controlled conditions. 

The relevant concepts were measured 
with questions utilizing Likert-type scales. 
Preliminary versions of this questionnaire 
were developed and revised on the basis of 
intensive interviews with students. The 
questionnaire focused on the aspirations of 
students in mathematics classes and on the 
relationship of students to teachers of math- 
ematics. Mathematics was selected since 
this course is required of all respondents and
because there should be little ambiguity in
students’ minds over the nature of a good
performance compared to courses using 
more subjective criteria for evaluation of 
their progress. The questions were repeated 
for English courses in order to examine the 
effects of a different course content and the 
consequences of the greater social emphasis 
upon achievement in mathematics which 
supposedly characterizes contemporary so- 
ciety. 

Four hundred male students, 100 from
each of four junior high schools, comprised
the sample. Tenth graders were selected be-
cause they include the oldest group with a 
wide distribution of ability. The four schools 
were chosen to provide a wide range of socio- 
economic status and ability of students. A 
comparison of results for the predicted rela-
tionships within each of the four schools re-
vealed that no important differences exist
among the schools insofar as the present 
data are concerned. 

Respondents in each school filled out the
questionnaire during a 1.5-hour period while 
teachers were absent from the testing room. 
The administration of the questionnaire uni- 
formly occurred shortly after students had 
received their grades for the fourth of six 
marking periods. The recent knowledge of
their grades provided the students with sta- 
ble evidence of their present level of per- 
formance while permitting the possibility of 
future changes in the grades and in the 
students’ aspirations for grades. 


Validation of Assumptions 


The interpretation of results in this study 
rests upon assumptions stated earlier about 
teachers’ intentions and students’ percep- 
tions of teachers’ acts. The results, on the 
whole, support the reasonableness of these 
assumptions. 

It was assumed that students believe 
their capacity is known by teachers. Evi- 
dence suggesting that this assumption is 
warranted for the purposes of this study is 
provided by the pupils’ answers to the
query: “Do you think your teacher is a good
judge of your ability?” In the responses 67%
of the students described their teachers as
“quite good” or “very good” judges of their
abilities. It was also assumed that students
are aware of their own capacities and that 
they support the norm that they should not 
be expected to perform at levels beyond 
capacity. Support for these assumptions is 
found in responses to the question: “How
reasonable is your teacher in how well he ex-
pects you to do?” Seventy-nine percent of
the students replied that their teacher “ex-
pects about the right amount from me,”
14% felt that the teacher required “too
much,” and 7% answered “too little.” These
percentages also provide indirect support 
for the assumption that teachers are guided 
by professional norms holding that students 
should not be expected to work beyond or 
below capacity. 

A further assumption was that teachers 
expect pupils to work up to their capacity 
level. Over 80% in each of the schools per- 
ceived their teacher as expecting they could 
perform at the level of their capacity, while 
less than 1% said that they had no idea what 
their teacher expected of them. Most stu- 
dents, however, did not view these as strong 
demands. In response to the question: “Has
your math teacher ‘pushed’ you to work to-
ward this [capacity] grade?” 55% said that
they felt “little or no pressure” while only
14% said that they felt “quite a lot” of pres-
sure.

The aspirations of students may be 
strongly determined by their confidence in 
themselves. We felt it was important, there-
fore, to determine what effect this personal-
ity characteristic might have upon students’
attitudes toward influence attempts by
teachers as well as students’ aspirations and
achievements. A standardized measurement 
of test-anxiety prepared by Mandler and 
Sarason (1952) was administered to all sub- 
jects. High scores on this measure indicate 
a high fear of failure. 

While test-anxiety was not correlated 
with students’ perceptions of teachers’ in- 
fluence, it was related to aspiration setting 
in a way consistent with findings in previous 
research on the level of aspiration. Atkinson 
has found that persons with high test-anxi- 
ety tend to avoid moderate risks (Atkinson, 
1957). The present data show that students 
with high test-anxiety set aspirations far- 
ther from their present level of performance 
than do students with low test-anxiety. The 
goals set by the more anxious students were 
often unrealistically high. The high goals 
are taken to be unrealistic since it was noted 
that the farther students set their aspira- 
tions from their current grades in mathe- 
matics, the less they were likely to attain 
their levels of aspiration in their final grades 
for the course (r = -.63**).²


2 Correlation coefficients reported in text
and tables are marked by asterisks to indi-
cate the probability values at the .05 (*)
and .01 or less (**) levels of significance,
two-tailed test.


RESULTS


How much were the students com-
mitting themselves when they stated
their levels of aspiration? To answer
this question, aspired grades were
compared with the grades actually
received by students at the end of the
semester. The substantial relationship
between aspired grades and those re-
ceived at the end of the semester (r =
.66**) suggests that the aspirations
were realistic among a majority of stu-
dents. The aspired grades usually were
set from one-third to one whole letter
grade higher than grades received for
immediately past performances, a
typical phenomenon in setting aspira-
tion levels indicating a desire for future
improvement.
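
The relationship between aspired and received grades reported above is a product-moment correlation. The following is a minimal sketch of how such a coefficient may be computed, assuming that letter grades are first converted to grade points. The conversion, the function name, and the sample values are hypothetical and are not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical grade points (A = 4.0, B+ = 3.3, ...), not data from the study.
aspired = [3.0, 3.3, 2.0, 4.0, 2.7, 3.7]
received = [2.7, 3.0, 2.0, 3.7, 2.0, 3.3]
r = pearson_r(aspired, received)
print(f"r = {r:.2f}")
```

A coefficient near 1 would indicate that students who aspire to higher grades also tend to receive them; it does not by itself show that the aspired and received grades are equal.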

Our principal concern is in the re- 
lationships of the separate forms of 
power exerted by teachers (as students 
view these matters) with the congru- 
ence between students’ stated aspira- 
tions for grades and their perceived 
capacities. 

Perceived level of capacity was meas- 
ured by the query: 


Not everyone can get A for a final grade 
in mathematics. Many students know that 
they must get something less because every- 
one has an upper limit to his ability. What 
final grade do you think you could get if you 
worked to the limit of your ability and did 
the best you could in mathematics for the 
rest of the semester? 


Aspired grade was obtained by the 
query: 


Students do not always feel like doing 
their best in a certain class. Sometimes they 
are willing to accept a grade which is not as 
good as they could get if they really tried. 
The final grade you will get in your math 
class this semester will depend partly on how 
hard you are going to work for the rest of 
this semester. Considering how hard you 
plan to work, what final grade do you think 
you should get in math this year? 


Effects of Separate Bases of Power 


The most direct attempts to in-
fluence students are based on the use of
sanctions: by rewarding or coercing.
Rewards are given or promised for
behavior which is in accord with the
wishes of the inducer. Coercion, based
upon the ability to punish, is exerted or
threatened for behavior that is not in
accord with the wishes of the inducer.
In a school, a teacher may administer
sanctions in many ways: by the grades
he gives, by comments or signs made to
students, by exclusion from the group
or assignment of responsibilities in
class, by reports to authorities or
parents, and the like. In order to en-
compass this variety of approving or
disapproving cues, it was found neces-
sary to cast questions about sanction-
ing acts by teachers in terms of the ap-
proval or disapproval that students
perceived teachers have toward them.
A general measure of the degree that
sanctioning was perceived as rewarding
or coercive was sought with the query:
“On the whole, how much do you feel
that your math teacher is pleased (or
displeased) with you compared to the
rest of the class?” It was expected that
the greater the relative approval re-
ceived by the student, the greater
would be the congruence between his
aspired grade and his capacity grade.
In the first row of Table 1 it can be
seen that this expectation is supported.


TABLE 1

CORRELATIONS BETWEEN FORMS OF POWER AND CONGRUENCE
(N = 415)

Form of power attributed to teacher         Congruence of aspiration
                                            and capacity

Comparative degree of rewarding sanction    [illegible]
Discriminate reward, frequency              .07
Indiscriminate reward, frequency            .03
Discriminate coercion, frequency            .09
Indiscriminate coercion, frequency          -.14**
Legitimacy of grading                       [illegible]
Expertness in grading                       .09
Referent status of teacher                  [illegible]

** p < .01.

Although reward and coercion can 
be conceived as the extreme ends of a 
single dimension, French and Raven 
note that “the distinction between 
these two types of power is important 
because the dynamics of them are dif-
ferent.” Extreme punishment of a
person, for example, may lead him to 
avoid or escape the whole situation in 
which it operates, while receipt of a 
valued reward may make the situation 
more attractive to him. Thus, measures 
were made of the degree that teachers 
reward (“Does your math teacher
seem to be pleased when you do your
best?”) and of the degree that teachers
coerce (“Does your math teacher seem
displeased when you don’t try very
hard and your work is not as good as it
could be?”). Each of these questions
was answered on a frequency scale. 
Because the reward and coercion in 
these instances are being given where 
they are ordinarily taken to be appro- 
priate reactions, they are designated 
as discriminate sanctions to distinguish 
them from indiscriminate sanctions, 
discussed in a moment. On the second 
and fourth lines of Table 1 are shown 
correlations between frequency of dis- 
criminate sanctions and congruence. 
The low and nonsignificant correla-
tions suggest that the degree of dis-
criminate reward or coercion that
teachers were perceived to use does not
affect the degree of congruence be-
tween aspirations and capacity.
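
Whether a correlation of a given size is reliable in a sample of this size can be checked with the usual t test for a correlation coefficient. The sketch below assumes N = 415, as printed in the heading of Table 1, and approximates the two-tailed probability with the normal distribution, a simplification adopted here because the degrees of freedom are large. The function names are illustrative only.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def two_tailed_p_normal(t):
    """Two-tailed probability from the normal approximation to t."""
    return math.erfc(abs(t) / math.sqrt(2.0))


N = 415  # sample size printed in the table headings

for label, r in [("discriminate coercion", 0.09),
                 ("indiscriminate coercion", -0.14)]:
    t = t_for_r(r, N)
    p = two_tailed_p_normal(t)
    print(f"{label}: r = {r:+.2f}, t = {t:.2f}, approximate p = {p:.3f}")
```

Under these assumptions a coefficient of .09 falls short of the .05 level, while a coefficient of -.14 passes the .01 level, in agreement with the asterisks in Table 1.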

Two further concepts relevant to 
sanctioning behavior were investigated. 
The first is indiscriminate reward: “Is
your math teacher ever pleased with 
your work even when you don’t try 
hard?” The second is indiscriminate 
coercion: “Does your math teacher 
ever seem to be displeased with you 
even when you do your best in class?” 
It can be seen in Table 1 that indis- 
criminate reward has no apparent ef- 
fect upon congruence while indiscrimi- 
nate coercion is inversely related to 
congruence. 

This last finding suggests that in-
discriminate coercion arouses stronger
tendencies to resist the teacher’s in-
ductions than to accept them. To test
such an hypothesis, students were
asked: “How often do you feel like
doing the things your math teacher
wants you to do?” Results relevant to
this hypothesis are shown in the first
column in Table 2. It is plain that stu-
dents perceived themselves as less
ready to conform to a teacher’s de-
sires when coercion was indiscriminate
than when it was discriminate.


TABLE 2

CORRELATIONS BETWEEN FORMS OF SANCTION AND POWER,
AND DESIRE TO CONFORM AND LEGITIMACY
(N = 415)

Form of power attributed     Desire to      Legitimacy
to teacher                   conform

Discriminate coercion        [illegible]    [illegible]
Indiscriminate coercion      [illegible]    [illegible]
Difference**

Discriminate reward          [illegible]    [illegible]
Indiscriminate reward        [illegible]    [illegible]
Difference**

* p < .05.
** p < .01.

It has been found in other research 
that resistance to influence becomes 
greater as legitimacy of influence de- 
creases (see French and Raven). The 
legitimacy of social sanctioning stems 
from the perception, in those being 
influenced, that the influencer is be- 
having in accord with the internalized 
values of the ones being influenced. The 
degree of legitimacy attributed to 
teachers’ behavior was measured by 
the query: “How fair is your mathe- 
matics teacher about most things?” 
The correlations between legitimacy 
and the separate forms of sanctioning 
are shown in the second column of 
Table 2. It can be seen that discrimi- 
nate reward is reliably related to 
legitimacy but indiscriminate reward 
has no relationship with legitimacy. 
Furthermore, indiscriminate coercion
is reliably associated with nonlegiti-
macy while discriminate coercion has
almost no association with legitimacy.
We conclude that teachers who are
considered discriminate in rewards are
likely to be seen as “fair,” while those
who are indiscriminate in coercion are
likely to be seen as “unfair.”

French and Raven hypothesize that 
coercion arouses strong resistance in 
the recipient of it so that the inducer’s 
desires are not always acted upon by 
the recipients and, depending upon the 
strength of the resistance aroused, the 
recipients may instead be stimulated 
to do the opposite of what has been 
asked of them. This hypothesis has 
been corroborated by Zipf (1958) and 
Sampson (1960). The present findings 
(and those to be seen in Table 3) sug- 
gest that resistance to coercion in the 
school setting may more readily gen- 
erate negativism when the coercion is 
indiscriminate. 

Legitimate power, we have seen, 
stems from the perception that an in- 
fluencer is behaving in accord with the 
values of the person being influenced. 
In some instances acts by a teacher 
may be perceived as legitimate or non- 
legitimate without their being direct 
attempts to influence the student. An 
example of this type of legitimacy con- 
cerns the fairness of the teacher in 
evaluating the student’s work. The 
degree of this form of legitimacy attrib- 
uted to teachers was measured by the 
question: “If you did your best in math 
class would your teacher actually give 
you the grade that describes your 
ability?” It was expected that students 
who perceived their teacher as more 
legitimate would tend to set levels of 
aspiration closer to their perceived 
capacities since the risk of failure from 
unfair treatment by the teacher would 
be minimal. Evidence reported in the 
sixth row of Table 1 supports this
prediction. Confidence, then, that one
will receive the grade he earns if he
works up to capacity (and not neces-
sarily a high grade) is associated with
tendencies to aspire to attain capacity.

Expert power is based upon the per-
ceived reliability of the influencer’s 
information. The more an informer is 
perceived as knowing what he is talking 
about, the more the informer is likely 
to influence the recipient of the infor- 
mation. Since we are assuming that 
teachers’ inductions are often placed 
upon the student in the direction of 
working up to capacity, it is evident 
that the teacher’s attempts to influence 
will be more acceptable if the student 
perceives that the teacher knows what 
the student’s capacity is. Thus, expert- 
ness of the teacher was measured in one 
way with the following question: “Do
you think your math teacher is a good 
judge of your ability in mathematics?” 
It was expected that greater attribu- 
tion of expertness to the teacher would 
be associated with greater congruence 
between aspiration and capacity. In 
the realm of teaching, however, expert- 
ness is also conceived as skill in the 
substantive content of the course 
being taught. A measure of this type 
of expertness is the following: “How 
much do you think your math teacher 
knows about the mathematics he is 
supposed to teach?” Results reported 
in Table 1 concern only expertness in 
judging the ability of students, since 
the results with the previous measure 
(expertness in math) were not different 
in any important respects. The non-
significant correlation in the seventh
row of Table 1 suggests that the expert-
ness of the teacher does not generate
greater congruence between aspira-
tions and capacity. The failure of
expertness to be related to congruence 
might be explained by French and 
Raven’s statement (1959) that 
expert power results in primary social in-
fluences on the person’s cognitive structure
and probably not on other types of systems. 
Of course, changes in the cognitive structure 
can change the direction of forces and hence 
of locomotion, but such a change of behavior 
is secondary social influence (p. 163). 


Referent power exists in an influencer 
when others desire to be like him. 
Students who are highly attracted to a 
teacher are likely to behave in ways of 
which he would approve, although they 
may not be aware of doing so. It was 
found offensive to students in pretests 
to inquire how much they desired to 
associate with or be like teachers. 
Therefore, measurement of this form 
of power was by means of the question: 
“In general, how much do you like 
your mathematics teacher as a per- 
son?” The prediction that greater 
referent power attributed to a teacher 
would be associated with greater
congruence between aspiration level
and perceived capacity was supported,
as shown in the eighth row of Table 1.

The effects of the various forms of
power may be summarized by noting 
the positive effects of rewarding sanc- 
tions, legitimate power, and referent 
power on the congruence between as- 
pirations and capacity, and the nega- 
tive effect of indiscriminate coercion. 
It is noteworthy that the first three 
forms are also significantly positively 
related to one another, indicating that 
they often appear simultaneously in 
the teacher’s behavior and often supple- 
ment one another, as proposed by 
French and Raven. Indiscriminate 
coercive power, on the other hand, is 
negatively related to each of the other 
forms. The first three forms of power, 
we may add, are attributed to teachers 
more often by students who attribute 
high capacity to themselves, while 
coercive power is attributed to teachers 
more often by students who assign low 
capacity to themselves. Nevertheless,
when perceived capacity is controlled,
the relationships in Table 1 are not 
substantially lowered and retain their 
statistical significance—students with 
high ability not differing greatly from 
those with low ability. 
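
One common way of controlling a third variable in correlational data is the first-order partial correlation. The text does not state which procedure was used, so the sketch below is offered only as an illustration of the idea. The coefficient of -.14 is taken from Table 1; the two correlations involving perceived capacity are hypothetical values chosen for the example.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


r_power_congruence = -0.14  # indiscriminate coercion and congruence (Table 1)
r_power_capacity = -0.10    # hypothetical: coercion and perceived capacity
r_cong_capacity = 0.10      # hypothetical: congruence and perceived capacity

controlled = partial_r(r_power_congruence, r_power_capacity, r_cong_capacity)
print(f"partial r = {controlled:.3f}")
```

When the control variable is only weakly related to the other two, as assumed here, the partial correlation stays close to the original coefficient.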


Student Performance


An important consequence of social 
power is the degree that the separate 
forms of power motivate students to 
perform at the level of their aspirations. 
The relationship between the closeness
of the aspired grade to the actual grade
the student received at the end of the
year, and the attribution of reward
power to the math teacher was r =
.24**. This correlation is significantly
greater than the corresponding correlation
with coercive power (r = -.08). Thus, positive
forms of influence appear to stimulate 
attainment of aspirations more than do 
coercive forms of influence. 


Valence and Difficulty of Attaining 
Capacity 

According to the theory of aspira- 
tion setting proposed by Lewin et al. 
(1944) one specific aspiration level, out 
of a number of possible alternatives, 
is likely to be chosen depending upon 
the degree that it is attractive but not 
too difficult to attain. We thus ex- 
pected to find students placing their 
levels of aspiration closer to their 
perceived capacity the more the capac- 
ity grades were valent for them and 
the less achievement of them was per- 
ceived as difficult. Valence of capacity 
was measured by the question: “How 
good do you think you would feel if 
you did get this grade?” Difficulty was
measured with the query: “How hard 
would you have to work in order to 
receive this grade?” The expectation 
just stated was supported: valence of 
capacity grade is positively related to 
congruence (r = .12*), and difficulty 
of attaining capacity grade is nega-
tively related to congruence (r =
-.27**). It is noteworthy that valence
and difficulty are positively related to
each other (r = .26**), supporting the
Lewin et al. (1944) and the Atkinson
(1957) findings that difficult goals
tend to be more attractive than easy
ones.

We had expected to find that dif- 
ferent forms of power would have dif- 
ferent degrees of relationship to the 
valence the student attributed to the 
achievement of his capacity grade. A 
teacher who rewards a student for 
working at capacity, for example, 
might generate a greater desire in him 
to achieve capacity than a teacher who 
punishes him for not doing so. The 
statistical relationships between each 
form of influence and valence of reach- 
ing capacity were, however, consist- 
ently too low to be considered reliable. 
But the teachers’ total amount of 
power in attempting to influence stu- 
dents appears to affect students’ per- 
ceptions of the valence of the capacity 
grade. When all forms of power are
considered together, the multiple
correlation between power and valence
of the capacity grade is R = .27**. The
nature of the contributions made by 
the separate forms of power, moreover, 
makes it appear likely that a teacher 
who employs several positive bases of 
power simultaneously, to support his 
inductions on a student to work up to 
capacity, will have greater effect upon 
the valence of doing so than a teacher 
who employs only one positive basis of 
power. Why and how social power can 
have effects upon the valence of a goal 
are problems worthy of future atten- 
tion. 
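
How several bases of power, each only weakly related to valence, can together yield a larger multiple correlation may be illustrated with the two-predictor case. The study combined all forms of power; the sketch below is restricted to two predictors for simplicity, and every numerical value in it is hypothetical.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of y with predictors 1 and 2, from pairwise r's."""
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical values: two bases of power, each correlating .15 with valence
# and .20 with each other.
R = multiple_r_two_predictors(0.15, 0.15, 0.20)
print(f"multiple R = {R:.3f}")
```

With these assumed values the multiple correlation exceeds either single correlation, which is the pattern the paragraph above describes for teachers who employ several positive bases of power at once.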

It seems reasonable that the closer a 
student sets his aspired grade to his 
perceived capacity, the more he will be 
satisfied in attaining this established 
aspiration. This contention was sup- 
ported by a correlation of .32** be-
tween congruence and valence of
success, the latter measured by the 
query: “How good would you feel if 
you were given the grade you intend to 
get?” 


Desire to Conform 


A direct determination of the readi- 
ness of students to be influenced by 
teachers was sought by the use of two 
related concepts: perceived desire to 
conform and perceived desire nega- 
tively to conform. The former was 
measured with the question: “How of-
ten do you feel like doing the things
your math teacher wants you to do?” 
The latter was measured by the query: 
“How often do you feel like doing the 
opposite of what your math teacher 
wants you to do?” All forms of power 
together are strongly related to desire 
to conform (multiple R = .56**). 
Desire to conform, in turn, is related to 
congruence (r = .20**). 

The relationships between the sepa-
rate forms of power and the desire to
conform, or to do the opposite, are
shown in Table 3. Consistent with
Table 1, indiscriminate coercion was
related to nonconforming desires, in-
discriminate coercion and reward were
negatively related to conforming de-
sires, and all other forms of power (in-
cluding expertness) were positively
associated with conforming desires.


TABLE 3

CORRELATIONS BETWEEN FORMS OF POWER AND MOTIVATION TO CONFORM
(N = 415)

Form of power attributed             Desire to      Desire negatively
to teachers                          conform        to conform

Comparative degree of rewarding      [illegible]    [illegible]
Discriminate reward, frequency       [illegible]    [illegible]
Indiscriminate reward, frequency     [illegible]    [illegible]
Discriminate coercion, frequency     [illegible]    [illegible]
Indiscriminate coercion, frequency   -.19**         [illegible]
Legitimacy of grading                [illegible]    [illegible]
Expertness in grading                [illegible]    [illegible]
Referent status of teacher           [illegible]    [illegible]

[Entries legible in the source but not assignable to a cell:
.35**, .29**, -.11**, .30**, .53**.]


Attitudes of Students toward Teacher and
Course


A final interest of this investigation
was in the relations between types of
power and attitudes toward relevant
aspects of the social setting. Two vari-
ables were considered here. One query
asked about changes in attitudes to-
ward the teacher (“Has your opinion
of your mathematics teacher changed
from what it was at first?”). The other
inquired about changes in attitude to-
ward the content of the course (“Do
you feel any different about mathe-
matics now than you did before you
took this math course?”). Both items
were scored in terms of direction of
attitude change as well as intensity of
change. In Table 4 correlations are
presented between these two attitudes
and the different forms of power attrib-
uted to teachers. The evidence indi-
cates that each form of power affects
both of these attitudes in directions
similar to the effects we have seen for
grade aspirations, and desires to con-
form, and in directions suggested by
French and Raven.


TABLE 4

CORRELATIONS BETWEEN FORMS OF POWER AND CHANGES IN ATTITUDES
(N = 415)

Form of power attributed             Change in        Change in
to teachers                          attitude         attitude
                                     toward teacher   toward math.

Comparative degree of rewarding      [illegible]      [illegible]
Discriminate reward, frequency       [illegible]      [illegible]
Indiscriminate reward, frequency     [illegible]      [illegible]
Discriminate coercion, frequency     [illegible]      [illegible]
Indiscriminate coercion, frequency   [illegible]      [illegible]
Legitimacy of grading                [illegible]      [illegible]
Expertness in grading                [illegible]      [illegible]
Referent status of teacher           [illegible]      [illegible]

[The one legible entry, .27**, belongs to the row “Comparative degree
of rewarding”; its column is not recoverable.]

** p < .01.


Results from English Classes


The results we have thus far ob-
served concerning teachers in mathe-
matics were not completely replicated
when students were queried about
their English teachers. Similar cor-
relations were found in mathematics
and English when relating forms of
power to desires for conformity and to
attitude changes toward the teacher
and the course content. Only legiti-
mate power, however, was signifi-
cantly related to the congruence of
aspired and capacity grades in the
English classes. The fact that legiti-
mate power appears to have positive
effects in both mathematics and Eng-
lish is understandable since legiti-
macy was rated by the students as the
most effective source of a teacher’s
power in response to an inquiry into
the relative importance of teachers,
parents, and peers as power figures.
The failure of the other bases of 
power among English teachers to be 
related to congruence, however, re- 
quires further explanation. The original 
reason for the inclusion of questions 
about English teachers was the expecta- 
tion that students would be less certain 
of the nature of good performance in 
English courses than in mathematics 
courses. We anticipated that influence 
might be more effective in English 
classes because students would be less 
clear about their capacities in that sub- 
ject matter and therefore less confident 
of appropriate aspirations for themselves,
thus being more vulnerable to
influences from teachers (cf. Festinger, 
1954). Contrary to our expectations, 
more uncertainty about the appro- 
priateness of their aspirations was 
shown in mathematics than in English, 
as indicated by responses to the ques- 
tion: “Do you think you are aiming 
for too high or too low a grade?” Comparisons,
furthermore, of responses to
questions concerning the degree that 
teachers reveal their reactions to good 
or bad performances and the adequacy 
of help rendered by teachers showed no 
differences between mathematics and 
English classes. We were led, there- 
fore, to suspect that the apparent in- 
effectiveness of influence attempts in 
English classes was due not to the 
course material itself or to the methods 
of teaching, but to the motivations of 
students. 

An indicator of student concern over 
performance was available in the pre- 
viously mentioned measure of valence 
of success. In mathematics classes, all 
forms of influence except coercion were 
significantly related to the valence of 
successfully achieving the aspired 
grades. In English classes, however, 
only referent power showed this re- 
lationship to a significant degree. It 
appears, then, that the students were 
more eager to do well in mathematics 
than in English. Further weight is lent 
to this interpretation by responses to 
the question: “Which class do you 
think is more important for your 
future?” In the replies, 34 students 
favored English, 152 preferred mathe- 
matics, and the rest saw them as 
equally important.* It is interesting to
note that in the literature on level of 
aspiration, experimental predictions 
are better supported the more that
subjects have ego involvement in their
tasks (Lewin et al., 1944; Stotland,
Thorley, Thomas, Cohen, & Zander,
1957).


* The questionnaire was administered to
students during the spring of 1959, that is,
during the post-Sputnik emphasis on mathematics
and physical science.


SUMMARY AND CONCLUSIONS 


A questionnaire was used to explore 
the effects of teachers’ influences upon 
students’ aspirations for achievement 
in school. Hypotheses drawn from 
earlier work on the differential con- 
sequences of separate types of social 
power were tested in a correlational 
analysis. 

1. Tendencies to accept a teacher’s 
influences are aroused in students who 
are subject to reward, legitimate, 
referent, or expert power; while tenden- 
cies to ignore or oppose what teachers 
desire are aroused in students subject 
to indiscriminate coercive influences. 

2. With the possible exception of 
expert power, these tendencies affect 
the degree to which students set their 
aspired grades congruent with their 
perceived capacities. 

3. Two forms of coercion are dis- 
tinguished by students: disapproval of 
inadequate performance, which ap- 
pears to have no effect on aspirations 
or future performance, and disapproval 
even when performance is as good as 
the student feels he can do, which 
seems to have negative effects on 
aspiration setting as well as future 
performance. 

4. Two forms of reward are also dis- 
criminated by students. Tendencies to 
accept a teacher’s influences are 
lowered under indiscriminate reward 
but increased by reward for adequate 
performances. 

5. The positive or negative forces set 
up by the separate bases of power af- 
fect the favorableness or negativeness 
of student attitudes toward teachers 
and course content. 

6. The separate bases of power are 
effective in determining aspirations to
the degree that the students are ego
involved in the performances on which 
they are setting aspirations.


The definition of leadership in interactional
terms as typified by Gibb’s
(1947) discussion has created a new 
frame of reference for the description 
of leadership in behavioral terms. Instead
of seeking static “traits of leadership”
in an individual, investigators
have been seeking criteria of “leadership
behavior.” For the purposes of
the present study leadership may be 
defined as the influence of an individual 
in interaction with other individuals 
within a group setting. In recent years 
many promising technics have been 
developed for studying leadership 
behavior. Findings reported by the 
Ohio State Leadership Study Group 
suggest an important concept, namely, 
that leadership acts may be thought of 
as “initiating structure in interaction” 
and “showing consideration” (Halpin,
1956a, 1956b). What appears to be of 
particular significance is the finding 
that these two dimensions may be use- 
ful in distinguishing between effective 
and ineffective leaders. 

The Leadership Behavior Descrip- 
tion Questionnaire (Halpin, 1956b), a 
paper-pencil instrument developed by
the Ohio State group, is useful in obtaining
descriptions of individuals
already in leadership positions. Such a 
technic is limited in its usefulness for 
assessing emerging leadership. An 
instrument constructed for observation 
of emerging leadership patterns would 
be particularly useful in exploring the 
relationship between various group- 
situational factors and successful
leadership behavior. Such assessments
would call for “on the spot” observations
of behavior and their categorization
according to a set of predetermined
criteria.

This paper describes an attempt to 
observe individuals in small leaderless 
groups and to categorize the emerging 
behavior employing the two dimen- 
sions of leadership behavior mentioned 
above. Before it is possible to estimate 
the predictive validity of such an ap- 
proach, it appears necessary to estab- 
lish that: (a) a group of raters can be 
trained to agree consistently about the 
classification of observed behaviors, 
(b) the behaviors observed are stable 
and consistent from one situation to 
another, and (c) the categories em- 
ployed are sufficiently independent of 
each other to yield pertinent informa- 
tion about each individual in accord- 
ance with the underlying concept of 
effective leadership. 


METHOD


Sample. The subjects were all students in 
the graduate division at Queens College.
Each subject was observed while participating
in two 30-minute, six-man discussion
situations. The discussions were held in a 
one-way vision room. The 1958 sample was 
composed of 32 subjects and the 1959 sample 
was composed of 37 subjects. Each year 
there were four trained observers, but only 
two of the observers were present both 
years. 

Raters or Observers. The raters were fac- 
ulty members at Queens College. Training 
sessions were held each year prior to actual 
observation of the subjects. Tape recordings 
of discussion group situations and several 
practice group sessions were analyzed and 
rated utilizing the instrument described be- 
low. The ratings compiled during the train- 
ing sessions were discussed in order to 
clarify and categorize various types of behavior.
Once the “actual” sessions started,
the observers rated independently.

The Rating Instrument. The theoretical
concept of leadership underlying the rating 
instrument is indicated in the introductory 
comments. Its major emphasis is on those 
overt acts of members of discussion groups
which may be classified according to the
categories described by Halpin (1956a) as
“initiates structure in interaction” and
“shows consideration.” In some of the
earlier experimental data obtained by the 
authors and in a later formulation by Hemp- 
hill (unpublished) it was indicated that a 
clearer distinction should be made between 
attempts to initiate structure and success in 
this operation. In the present technic, the 
latter behaviors are recorded as influence 
acts. 

In recording an individual’s behavior, if 
the person identifies a problem, suggests 
some procedure for finding a solution to a 
problem, or opens a channel of communica- 
tion for another person, he is accorded a 
tally in the category, “attempts to initiate
structure.” If there is some overt acknowledgment
by another group member of
the “attempt to initiate structure,” the
individual is accorded an influence tally. If
a person identifies himself as a member of
the group, supports another’s point of view,
or is considerate of another’s feelings or
attitudes, he receives a plus tally in the
category, “shows consideration.” If, on the
other hand, he is sarcastic, caustic, or at- 
tacks another group member, a minus tally 
is recorded. Thus, in this area, a group mem- 
ber may receive both positive and negative 
tallies. 

Procedures. Each of the subjects was ini- 
tially assigned at random to a discussion 
group. In 1958, assignments to the second 
discussion group were based on the prin- 
ciple that the group members should be new 
to each other. In 1959, the group member- 
ship was held constant in both situations. 
As far as possible all conditions were stand- 
ardized. Based on previous research (Wilson 
& Robbins, 1955), six-person discussion 
groups were maintained wherever possible.¹
Each discussion group was assigned the
same discussion problem when the members
were seated in the one-way vision room.
Each session lasted 30 minutes.


¹ Due to scheduling difficulties it was not
possible to have all groups with six persons.
The majority of the groups did have this
number; however, one group had four members
and another seven. Research suggests
that although six is an optimal number of
participants, there are not statistically significant
differences between groups when
four or seven people are observed (Wilson &
Robbins, 1955).

Scores. Although an individual had to 
attempt to initiate structure (Category 1) 
before he could influence the structure 
(Category 2) these acts theoretically repre- 
sent different types of behavior. The scores 
in Categories 1 (attempts to initiate struc- 
ture) and 2 (influence in initiating structure) 
were obtained by summing the number of 
tallies recorded for each subject by each 
rater. The “score” for each subject in
Category 3 (shows consideration) was the 
difference between positive and negative 
tallies. Rater reliability, subject consist- 
ency and stability have been estimated for 
each category and are reported below. 
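
The scoring rule just described can be sketched in a few lines. This is an illustration, not the authors' procedure: the tally labels (`"attempt"`, `"influence"`, `"consideration+"`, `"consideration-"`) and the function name `category_scores` are hypothetical. Categories 1 and 2 are simple counts, and Category 3 is the number of plus tallies minus the number of minus tallies.

```python
from collections import Counter


def category_scores(tallies):
    """Score one subject from one rater's list of tallies.

    The tally labels are hypothetical names chosen for this sketch.
    """
    counts = Counter(tallies)
    return {
        # Category 1: attempts to initiate structure (sum of tallies)
        "attempts": counts["attempt"],
        # Category 2: influence in initiating structure (sum of tallies)
        "influence": counts["influence"],
        # Category 3: shows consideration (plus tallies minus minus tallies)
        "consideration": counts["consideration+"] - counts["consideration-"],
    }


if __name__ == "__main__":
    observed = ["attempt", "attempt", "influence",
                "consideration+", "consideration-", "consideration+"]
    print(category_scores(observed))
```

Because an influence tally is recorded only after an attempt has been acknowledged, a subject's influence count cannot exceed his attempts count in this scheme.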

Analysis of Data. The analysis of variance 
technique was utilized to estimate the sub- 
ject consistency, subject stability, and rater 
agreement coefficients for each leadership 
behavior category. Initially coefficients of 
rater agreement were obtained for each 
problem situation as well as for both situa- 
tions combined (cf. Table 1). On the basis 
of these coefficients of rater agreement it 
was decided to combine the subject’s ratings 
within each leadership behavior category 
over both problem situations. These com- 
bined ratings were treated to estimate the 
subject consistency (reliability) and sta- 
bility coefficients for each leadership be- 
havior category. Since the problem situa- 
tion conditions (e.g., changing membership) 
were different for the 1958 and 1959 samples 
the data are analyzed independently. 

The coefficients of rater agreement in 
each situation for each leadership behavior 
category were estimated by the formula: 


$$R_{\text{agreement}} = \frac{S_s^2 - S_e^2}{S_s^2}$$


where $S_s^2$ equals the between subjects mean
square and $S_e^2$, the residual error mean
square. The coefficients of rater agreement 
for combined situations were derived by the 
formula: 


$$R_{\text{agreement}} = \frac{S_s^2 - S_{sr}^2}{S_s^2}$$


In this formula the $S_{sr}^2$ stands for the variance
error due to subjects by raters interaction.
This coefficient may be interpreted
as the correlation between ratings of N
judges and ratings of N other judges on
these same performances.


2 The writers wish to express thanks to
Donald Medley, Division of Teacher Education,
Bureau of Research, Board of Higher
Education of New York City, for his help in
the statistical analysis of the data.

The coefficients of subject consistency 
(reliability) for each Leadership Behavior 
Category were estimated by using:


$$R_{\text{cons.}} = \frac{S_s^2 - S_{sp}^2 - S_{sr}^2 + S_e^2}{S_s^2}$$


The only new term added here, $S_{sp}^2$, equals
the variance from errors due to the subjects
by problems interaction. These coefficients
may be interpreted as the correlations between
ratings by N judges on two problems
and ratings by N judges on two other problems.
In obtaining the coefficient of stability,
the formula was


$$R_{\text{stab.}} = \frac{S_s^2 - S_{sp}^2}{S_s^2}$$


This may be interpreted as the correlation 
between ratings on two problems and rat- 
ings on two other problems by the same 
judges. The assumptions underlying the
derivations of the various formulae were 
based on Model II considerations as dis- 
cussed by Medley, Mitzel, and Doi (1956). 
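
The coefficients above can be illustrated with a small computation on a fully crossed subjects by problems by raters layout. This is a sketch under stated assumptions, not the authors' computation: it assumes one rating per cell, takes the residual mean square from the three-way interaction, and uses the reading of the formulas given in the comments. The function name `reliability_coefficients` and the demonstration data are hypothetical.

```python
from itertools import product


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def reliability_coefficients(x):
    """x[s][p][r] is the score of subject s on problem p from rater r.

    Assumes a fully crossed design with one score per cell.
    """
    S, P, R = len(x), len(x[0]), len(x[0][0])
    cells = list(product(range(S), range(P), range(R)))
    g = mean(x[s][p][r] for s, p, r in cells)
    m_s = [mean(x[s][p][r] for p, r in product(range(P), range(R))) for s in range(S)]
    m_p = [mean(x[s][p][r] for s, r in product(range(S), range(R))) for p in range(P)]
    m_r = [mean(x[s][p][r] for s, p in product(range(S), range(P))) for r in range(R)]
    m_sp = [[mean(x[s][p]) for p in range(P)] for s in range(S)]
    m_sr = [[mean(x[s][p][r] for p in range(P)) for r in range(R)] for s in range(S)]
    m_pr = [[mean(x[s][p][r] for s in range(S)) for r in range(R)] for p in range(P)]

    # Mean squares: between subjects, subjects x problems,
    # subjects x raters, and residual (three-way interaction).
    ms_s = P * R * sum((m - g) ** 2 for m in m_s) / (S - 1)
    ms_sp = R * sum((m_sp[s][p] - m_s[s] - m_p[p] + g) ** 2
                    for s, p in product(range(S), range(P))) / ((S - 1) * (P - 1))
    ms_sr = P * sum((m_sr[s][r] - m_s[s] - m_r[r] + g) ** 2
                    for s, r in product(range(S), range(R))) / ((S - 1) * (R - 1))
    ms_e = sum((x[s][p][r] - m_sp[s][p] - m_sr[s][r] - m_pr[p][r]
                + m_s[s] + m_p[p] + m_r[r] - g) ** 2
               for s, p, r in cells) / ((S - 1) * (P - 1) * (R - 1))

    return {
        "agreement": (ms_s - ms_sr) / ms_s,                  # combined situations
        "consistency": (ms_s - ms_sp - ms_sr + ms_e) / ms_s,
        "stability": (ms_s - ms_sp) / ms_s,
    }


if __name__ == "__main__":
    # Hypothetical ratings: 4 subjects, 2 problems, 4 raters.
    demo = [[[2 * s + p + 0.5 * r + ((s * p + r) % 2) for r in range(4)]
             for p in range(2)] for s in range(4)]
    for name, value in reliability_coefficients(demo).items():
        print(f"{name}: {value:.2f}")
```

When the ratings are purely additive in subject, problem, and rater, every interaction mean square is zero and all three coefficients equal 1; interaction variance pulls the corresponding coefficient below 1.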


RESULTS 

Coefficients of rater agreement rele- 
vant to each Leadership Behavior 
Category for each problem situation 
and for situations combined are pre- 
sented in Table 1. The coefficients for 
“Attempts to initiate structure” 
ranged from .52 to .74 in 1958 for a 
sample of 32 subjects and from .73 to 
.89 in 1959 for a sample of 37. The
relationship between observer ratings 
for the category “Success in initiating 
structure” ranged from .81 to .82 in 
1958 and from .83 to .88 in 1959. On the 
“Shows consideration” category the
coefficients ranged from .73 to .92 and
from .70 to .92 in 1958 and 1959,
respectively. All the coefficients pre-
sented are statistically significant well
beyond the .01 level of confidence.
Comparing the 1958 estimates of rater
agreement with those of 1959, small
increments in the latter coefficients
may be noted. This may be due to in-
creased sampling, or to the fact that at
least two of the judges had had an
additional year of training and experi-
ence.

Subject consistency and stability
within each Leadership Behavior Cat-
egory over two problem situations are
presented in Table 2. The consistency
coefficients ranged from .07 to .51 in
1958 and from .58 to .92 in 1959. The
coefficients pertinent to stability of
performance ranged from .07 to .51
for 1958 and from .59 to .85 in 1959.
In 1959 all the coefficients were statis-
tically significant beyond the .01 level
of confidence. It may be noted that in
all categories the estimates of consis-
tency and stability are considerably
higher for the 1959 data than for the
1958 data. More will be said about
this below.

Table 3 presents correlation coef-
ficients pertinent to the degree of
independence, or overlap, between
each of the Leadership Behavior
Categories. These coefficients were
computed by the Pearson product-moment
method for the 1959 data only
and are based on the combined ratings
for each subject. Only the correlation
between the Success and Considera-
tion categories was found to be significant
beyond the .01 level of confidence
(r = .65). The coefficients
between Attempts and Success (r = .03)
and Attempts and Consideration (r =
.00) suggest little or no relationship.


TABLE 1
COEFFICIENTS OF RATER AGREEMENT FOR EACH DISCUSSION SITUATION AND
OVER COMBINED DISCUSSION SITUATIONS

Columns: Situation 1, Situation 2, and Combined, each for 1958 (N = 32) and 1959 (N = 37)

Leadership Behavior Categories | Surviving coefficients (column positions lost)
Attempts to initiate structure | .89, .52, .73, .74
Success in initiating structure | .82, .81, .83, .88
Shows consideration | .92

Note. All coefficients are significant at the .01 level of confidence.


TABLE 2
COEFFICIENTS OF STABILITY AND CONSISTENCY FOR THREE LEADERSHIP
BEHAVIOR CATEGORIES OVER COMBINED DISCUSSION SITUATIONS

Columns: consistency and stability, each for 1958 (a) and 1959 (b)

Leadership Behavior Categories: Attempts to initiate structure; Success in initiating structure; Shows consideration
Surviving coefficients (cell positions lost): .51*, .51*, .92*

a Group membership in each discussion situation was different.
b Group membership in each discussion situation was the same.
* Significant at the .01 level of confidence.


TABLE 3
RELATIONSHIP BETWEEN LEADERSHIP BEHAVIOR CATEGORIES (COMBINED
SITUATIONS) AND PERCENTAGE OF COMMON VARIANCE

Leadership Behavior Categories | r | N | Percentage of Overlap
Attempts/Success | .03 | 37 |
Attempts/Consideration | .00 | 37 |
Success/Consideration | .65 | 37 | 41


Discussion 


The findings of the present study 
may be examined in terms of the three 
issues raised in the introduction to this
paper.


Rater Agreement 


The data presented in Table 1 seem 
to support the notion that raters 
(observers) can be trained to observe
an individual’s behavior in a small 
group setting and agree consistently 
with other observers about which acts 
demonstrate “attempts to initiate 
structure,” “success (or influence) in 
initiating structure,” or “shows consideration.”
This finding in itself is not
particularly significant, having been 
demonstrated any number of times in 
the literature (Bales, 1950; Bass, 1954). 
What may prove of some importance, 
however, is that the raters agreed 
about acts which are theoretically 
related to effective leadership as dem- 
onstrated by the Ohio State studies of 
leadership. 


Independence of Leadership Categories 

Although the raters agree with each 
other with a high degree of consistency 
it is important to determine the magni- 
tude of the relationship among the 
three Leadership Behavior Categories. 
One may ask: “Are the raters actually 
discriminating between the various 
types of behavior observed?” Table 3 
presents data which suggest that there 
is probably no overlapping between 
“attempts” and “consideration” and 
between “attempts” and “success or 
influence.” The relationship between 
“success” and “consideration” indi- 
cates that there is approximately 41% 
overlap in variances. Thus one may 
conclude tentatively that each cate- 
gory is somewhat independent and may 
be measuring important attributes of 
effective leadership. On the basis of the 
Halpin (1956a) work on effective 
leadership it may be postulated that a 
portion of the relationship between 
“success” and “consideration” may be 
due to a third variable, i.e., effective 
leadership. It seems reasonable to
contend that the three Leadership
Behavior Categories are probably in- 
dependent enough of each other to be 
useful measures. 
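
The percentage of overlap discussed above is the squared correlation expressed as a percentage. The short sketch below, with the hypothetical function name `percent_common_variance`, applies this to the three correlations reported in the text. It is assumed here that the reported 41% was computed from an unrounded r, since squaring the rounded value of .65 gives a figure slightly above 42%.

```python
def percent_common_variance(r):
    """Percentage of variance shared by two measures correlating r."""
    return 100.0 * r * r


if __name__ == "__main__":
    reported = [
        ("Attempts/Success", 0.03),
        ("Attempts/Consideration", 0.00),
        ("Success/Consideration", 0.65),
    ]
    for label, r in reported:
        print(f"{label}: r = {r:.2f}, overlap = {percent_common_variance(r):.1f}%")
```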

From another point of view, it may 
be argued that an individual’s “attempts
to initiate structure” have little
to do with whether or not he shows 
consideration toward the other group 
members. On the other hand, one 
must not conclude that “attempts” 
and “success” are not related. The 
present scoring system minimizes the 
possibility of correlation between “at- 
tempts” and “influences.” At present 
a new scoring system is being sought,
but results are not yet conclusive. 
Comparison of the estimates of rela- 
tionship between the three categories, 
however, seems to support the argu- 
ment that “successful initiation of 
structure (i.e., influence on the group 
problem solving procedures)” is related 
to consideration shown by the influen- 
tial individual to other group members. 
Although one may try to influence his 
peers, apparently one must also show 
consideration for them if he is to suc- 
ceed in influencing the group problem 
solving procedures. 


Stability and Consistency 

If any instrument is to prove useful
for predictive purposes or for making 
generalizations, it is necessary to assess 
behaviors which are stable and con- 
sistent. Analysis of the 1959 data sug- 
gests that subject performance is 
recorded as reasonably stable and con- 
sistent in all three Leadership Behavior 
Categories by the four raters. It is 
interesting to note that the AIS (“attempts
to initiate structure”) acts
are observed as less stable and con- 
sistent than acts in the other two cat- 
egories. The major source of error 
variance influencing all of the coeffi- 
cients of stability and consistency 
(this is true for 1958 and 1959 data) is
the subject × problems interaction. It
may be hypothesized that the major
source of variation influencing the S × P
interaction term involves the constancy
of the group membership and
the nature of the problem presented in 
each situation. In the case of the suc- 
cess and consideration categories the 
coefficients of stability and consistency 
are increased considerably in 1959 by 
holding group membership constant 
and presenting two similar problems. 

Examination of the coefficients rele- 
vant to the AIS category indicates 
that the magnitude of the increments 
is considerably less than for the Suc- 
cess or Consideration categories. The 
present design does not permit any 
clear explanation of these findings. It 
seems logical to contend, however, 
that a person has more cognitive con- 
trol over his attempts to initiate struc- 
ture than his behavior in the other two 
categories. The “success or influence” 
behavior is, in reality, a measure of 
the other group members’ reactions to 
Individual A’s attempts to initiate 
structure. Hence, Person A has little
“control” over the stability of these
“acts” except as he chooses not to attempt
to initiate structure. The “attempts
to initiate structure” acts appear
to be primarily cognitively
oriented, whereas the consideration
acts appear to be basically noncogni- 
tive. If this hypothesis is tenable then 
Person A could not control the non- 
cognitive acts as directly as he could 
control the “attempts” he makes to 
initiate structure. On the basis of the 
data presented in Table 2 it seems 
probable that the individual who is 
perceived as considerate and who at- 
tempts to initiate structure during the 
first group session has a higher probability
of success than the individual
who is not seen as considerate. It seems
likely that the individual who was successful
during the first group session
will continue to be successful during a 
second situation if the group member- 
ship is unchanged and the nature of the 
problems is similar. On the other hand, 
the individual who attempts to initiate 
structure but is not influential (for 
whatever reason) must change his 
behavior in the next group discussion. 
Hence, he may either make more or 
fewer attempts during the next session. 
Thus, his “attempts” behavior would 
seem less stable and consistent. Such 
hypothesizing is, of course, extremely 
tentative but may indicate some of the 
interesting problems about which the 
present technic may help to provide 
more information. 

The instrument may have consider- 
able use for both research purposes 
and more pragmatic ends. For ex- 
ample, the classification of leadership 
behavior into three categories permits 
potential clarification of the relationship
between various leadership acts
and other measures. A second project 
might be to find the relationship be- 
tween various group dimension vari- 
ables and leadership behavior cate- 
gories. This technic might be useful in 
designing selection procedures where 
vocational success is highly related to 
effective leadership behavior. 

The present instrument seems to 
satisfy the criteria for content validity, 
rater reliability, and overall reliability. 
The next research steps are to find out 
whether this technic also satisfies the 
criteria for construct, concurrent, and 
predictive validity. 


SUMMARY 


Thirty-two subjects in 1958 and 37 
subjects in 1959 were observed during 
two leaderless group discussions by 
four trained observers. Each subject’s
leadership behavior was categorized as
“attempts to initiate structure,” “success
in initiating structure,” or “shows
consideration.”

The findings suggest: (a) Raters can 
be trained to agree consistently about 
the classification of leadership behav- 
iors observed during a group discussion. 
(b) The behaviors in the “success” and 
“consideration” categories are highly 
stable and consistent from one discus- 
sion situation to another when the 
group membership is held constant 
and when problems of a similar nature 
are being discussed. When these con- 
ditions do not prevail the leadership 
behaviors are not particularly stable or 
consistent. When the group member- 
ship is held constant the attempts to 
initiate structure acts are not as 
stable or consistent as the other two 
types of acts. (c) The three categories 
seem sufficiently independent to be 
considered as three distinct types of 
observed behaviors. The correlation 
between success and consideration 
seems to suggest a mediating variable, 
i.e., effective leadership. 

The technic is seen to have potential 
value for leadership—group situa- 
tional—organismic variable research. 
Considerably more work is necessary 
to determine whether the instrument 
satisfies the criteria for predictive, 
concurrent, and construct validities.


Recent work (Dreger & Aiken, 1957;
Dutton, 1954; Gough, 1954; Poffen- 
burger & Norton, 1956; Tulock, 1957) 
suggests that performance in mathe- 
matics is influenced by nonintellective 
as well as intellective variables. Mathe- 
maphobia (Gough, 1954), i.e., pro- 
nounced fears in the presence of arith- 
metic and mathematics, and other 
negative attitudes toward mathe- 
matics demand explanation. The 
simplest explanation is that such re- 
actions result from experiences specific 
to the learning of mathematics, in 
particular that the manner in which 
significant others, viz., teachers and 
parents, instruct children in mathe- 
matics is the primary determinant of 
their attitudes toward this subject, 
referred to here as “math attitudes.” 
The present investigation provides a 
limited test of the direct experience 
hypothesis of the etiology of math 
attitudes by studying, first, relations 
between selected intellective and non- 
intellective variables and math atti- 
tudes and, second, the contributions of 
these attitudes to the prediction of 
achievement in mathematics. 


HYPOTHESES 


Relation of Math Attitudes to Achieve- 
ment Measures 


Final mathematics course grades.
Math attitude scores make a significant
contribution to the prediction of final
grades in a mathematics course.


1 The authors wish to thank Lyle V.
Jones of the University of North Carolina
Psychometric Laboratory for his help in
planning and implementing this investigation.

Mathematics achievement test 
changes. Math attitude scores predict 
gains in scores from initial to final 
administration of a mathematics 
achievement test when training has 
intervened. 


Relation of Math Attitudes to Person- 
ality Measures 


Temperament. Math attitudes are 
unrelated to specified “‘general person- 
ality” variables. 

Ability. Math attitudes are posi- 
tively correlated with numerical abil- 
ity. 


Relations of Math Attitudes to Experi- 
ences with Mathematics 


Ratings of mathematics teachers. 
Math attitudes are positively cor- 
related with subjects’ ratings of former 
mathematics teachers. 

Reported parental encouragement. 
(a) Math attitudes are positively cor- 
related with subjects’ reports of early 
parental encouragement of mathe- 
matical endeavors. (b) Math attitudes 
are unrelated to subjects’ reports of 
encouragement of studying academic 
subjects in general. 

Reported parental attitudes toward 
mathematics. Math attitudes are posi- 
tively correlated with subjects’ reports 
of parents’ own math attitudes. 

Reported traumatic experiences with
mathematics. Math attitudes from
favorable to unfavorable are correlated
negatively with the number of frustrating
or embarrassing situations associated
with mathematics.


METHOD


Measures. Paragraphs describing atti- 
tudes toward mathematics written by 310 
college students were reduced to scaled 
items according to Likert’s procedure (Ed- 
wards, 1957) to constitute the basis for the 
Math Attitude Scale. The final scale con- 
sisted of 10 items connoting negative atti- 
tudes and 10 connoting positive. Sample 
items are:*

8. Mathematics makes me feel uncom- 
fortable, restless, irritable, and impatient. 

13. I approach math with a feeling of 
hesitation—hesitation resulting from a 
fear of not being able to do math. 

18. I love mathematics, and I am hap- 
pier in a math class than in any other 
class. 

5. Mathematics makes me feel secure, 
and at the same time it is stimulating.

Preliminary investigation using this
scale attested to its reliability (r = .94 for
test-retest). In addition, a test of independ- 
ence between the scores on the attitude 
scale and scores on four items designed to 
measure attitudes toward academic sub- 
jects in general suggested that attitudes 
specific to mathematics were being measured
(χ² = .80, df = 1).
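
Scoring a Likert-type scale with both positively and negatively worded items can be sketched as follows. Several details are assumptions made for illustration: the report does not give the number of response points (five are assumed here, with 5 meaning strong agreement), the function name `likert_total` is hypothetical, and only the four sample items quoted above are used, with Items 8 and 13 treated as negative and Items 18 and 5 as positive on the basis of their wording.

```python
def likert_total(responses, negative_items, points=5):
    """Sum item responses, reverse-scoring negatively worded items.

    responses maps item number to a rating from 1 to `points`.
    The five-point format is an assumption of this sketch.
    """
    total = 0
    for item, answer in responses.items():
        if not 1 <= answer <= points:
            raise ValueError(f"item {item}: response {answer} out of range")
        if item in negative_items:
            # Strong agreement with a negative item earns the lowest score.
            total += points + 1 - answer
        else:
            total += answer
    return total


if __name__ == "__main__":
    negative = {8, 13}                       # from the sample items quoted above
    answers = {8: 5, 13: 4, 18: 2, 5: 1}     # hypothetical respondent
    print(likert_total(answers, negative))
```

With this keying a higher total indicates a more favorable attitude toward mathematics; the direction of keying is likewise a choice made for the sketch.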

Besides the main nonintellective meas- 
ure, the Math Attitude Scale, the Min- 
nesota Counseling Inventory (Berdie & 
Layton, 1957) and the Intensive Personal 
Data Sheet were selected to assess nonintel- 
lective variables. The MCI was chosen not 
only because as a group inventory it met 
time requirements but also because it ap- 
pears to incorporate some of the better 
features of its well-known cousins, the 
MMPI and the Minnesota Personality 
Scale, and to assess variables expected to 
relate to academic performance. The IPDS, 
developed at the University of Southern 
California, was adapted for use in this
study. Intellective measures employed were
three of the Differential Aptitude Tests
(Verbal Reasoning, Numerical Ability, and
Abstract Reasoning; Bennett, Seashore, &
Wesman, 1952), the Cooperative Mathematics
Pretest for College Students (Mathematical
Association of America, 1947),
high school mathematics averages, and
final grades in college freshman mathematics.

* The Math Attitude Scale has been deposited
with the American Documentation
Institute. Order Document No. 6545 from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress,
Washington 25, D. C., remitting in advance
$1.25 for microfilm or $1.25 for photocopies.
Make checks payable to Chief,
Photoduplication Service, Library of Congress.

Subjects and Procedures. On the basis of 
their scores on the mathematics pretest, 
administered during orientation week, en- 
tering freshmen at a southeastern college 
who elected their mathematics for the fall 
semester were assigned to general mathe- 
matics, intermediate algebra, or college 
algebra classes. Most of the analyses were 
carried out on the five sections of general 
mathematics. All the data beyond the pre- 
test were collected during the first few 
meetings of the classes, except in the case 
of the second administration of the mathe- 
matics pretest which took place one week 
before the final examinations. Data were 
analyzed primarily by means of multiple 
and partial correlation and regression meth- 
ods. Hypotheses were tested for males and 
females separately inasmuch as a pilot study 
had indicated the possibility of sex differ- 
ences in math attitudes. 


RESULTS 


Math Attitudes and Achievement Meas- 
ures 


Final Course Grades. Multiple regression
analyses of the predictive
value of the Math Attitude Scale were 
made with the 60 males and 67 females 
taking general mathematics. On the 
basis of the intercorrelations among 
the five intellective predictor variables 
and the criterion variable (final grades), 
regression analyses were restricted to 
the independent variates of high school 
mathematics average, DAT Verbal 
Reasoning, DAT Numerical Ability, 
and the Math Attitude Scale. The 
multiple correlation coefficients were 
.67 and .63 for males and females,
respectively (p < .01). For cross-validation,
the predictor equations for
students in general mathematics were 

TABLE 1

TESTS OF SIGNIFICANCE OF PARTIAL REGRESSION COEFFICIENTS IN MULTIPLE
REGRESSION OF DIFFERENTIAL APTITUDE TEST VARIABLES, MATH ATTITUDES,
AND HIGH SCHOOL MATH AVERAGE ON FINAL GRADES IN GENERAL MATHEMATICS

                                   Values of t
Variable                       Males       Females
                              (N = 60)    (N = 67)

High School Math Average       4.12**      1.23
Math Attitude Scale            1.13        3.16**
DAT Verbal Reasoning           4.07**       .91
DAT Numerical Ability          5.09**      2.01*

* Significant beyond the .05 level
** Significant beyond the .01 level

applied to the scores on the independent
variables of students taking algebra.
The obtained and predicted grades
in algebra correlated .69 for the 42 
males and .65 for the 20 females. Tests 
of significance of the partial regression 
coefficients in the original equations 
showed that, for the males, all variables 
except Math Attitude made significant 
contributions. Only Math Attitude 
and DAT Numerical Ability played 
significant roles in the predictor equa- 
tions for the females (see Table 1). 
Thus, the hypothesis of significant 
contribution of math attitudes to 
prediction of achievement is borne out 
for females, but not for males. 
Achievement Test Changes. In the 
part of the study relating to gains in 
mathematics achievement test scores, 
test and retest data on the mathe- 
matics pretest were obtained on 52 
males and 63 females in the three 
mathematics courses. The partial cor- 
relation coefficients between Math 
Attitude Scale scores and retest scores 
on the mathematics pretest, partialing 
out the effects of initial scores on the 
latter, were .33 for males and .34 for 


females. Both of these coefficients are 
significant beyond the .02 level. As 
hypothesized, Math Attitude Scale 
scores predicted gains in scores on the 
mathematics pretest. 
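
The partial correlations reported in this section can be illustrated with a minimal Python sketch. It assumes the standard first-order formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), and a t test on n - 3 degrees of freedom. The function names and the zero-order correlations in the example are hypothetical, since the article reports only the partial coefficients themselves.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def t_for_partial_r(r, n):
    """t statistic and degrees of freedom for a first-order partial r (df = n - 3)."""
    df = n - 3
    return r * math.sqrt(df / (1 - r**2)), df

# Hypothetical zero-order correlations (not from the article):
# attitude-retest, attitude-pretest, retest-pretest.
example_partial = partial_r(0.45, 0.30, 0.60)

# The reported partial coefficient of .33 for the 52 males, tested on 49 df.
t_males, df_males = t_for_partial_r(0.33, 52)
```

Under these assumptions the reported .33 for males corresponds to a t of roughly 2.4 on 49 degrees of freedom, which can be compared with a tabled critical value.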


Math Attitudes and Personality Meas- 
ures 


Temperament. As a test of the hy- 
pothesis of the unrelatedness of “gen- 
eral personality” (or temperament) 
variables to math attitudes, multiple 
regression analyses were carried out, 
including the seven part-scores of the 
MCI with the DAT Verbal Reasoning, 
Numerical Ability, and Abstract Rea- 
soning tests, and with a criterion of 
Math Attitude Scale scores. The larg- 
est portion of the variance, for both 
sexes, was accounted for by the regres- 
sion of DAT Numerical Ability on 
Math Attitude Scale scores (see Table 
2). However, MCI Leadership, for 
males, was significantly correlated with 
math attitudes (r = —.21, p < .05). 
As Table 2 indicates, there is a slight 
suggestion that females with good 
“adjustment to reality” have more
positive feelings toward mathematics 
than those with poorer adjustment. 
For males, leadership qualities and 
positive math attitudes may be re- 
lated. In either case, the significant 
relation could be a chance result. 
Considering the general lack of cor- 
relation, it may be concluded that the 
hypothesis of unrelatedness of tem- 
perament measures and math attitudes 
is confirmed, though obviously not 
proved. 

Ability. Confirmation of the hy- 
pothesis that math attitudes are 
positively related to numerical ability 
is found in the significant partial cor- 
relations between DAT Numerical 
Ability and Math Attitude Scale scores 
when the effects of the other two DAT 
tests are partialed out. These coeffi- 
cients are .23 for 87 females and .51 

TABLE 2

ANALYSIS OF REGRESSION OF MINNESOTA COUNSELING INVENTORY AND
DIFFERENTIAL APTITUDE TESTS ON MATH ATTITUDE SCALE SCORES

Source of Variation                                  df       MS         F

Males (N = 96)

Regression due to 10 variables                       10     907.629     4.17**
Regression due to DAT Numerical Ability               1    5482.898    25.19**
Regression due to 9 variables (omitting
  Numerical Ability)                                  9     399.265     1.83
Regression due to 8 variables (omitting MCI
  Leadership and DAT Numerical Ability)               8     339.930     1.56
Error                                                85     217.675

Females (N = 87)

Regression due to 10 variables                       10     855.098     3.70**
Regression due to DAT Numerical Ability               1    4847.482    20.98**
Regression due to 9 variables (omitting
  Numerical Ability)                                  9     411.499     1.78
Regression due to 8 variables (omitting MCI
  Adjustment to Reality and DAT Numerical
  Ability)                                            8     253.612     1.10
Error                                                76     231.025

for 96 males in the three mathematics 
courses. A significant difference (p <
.01) between the two coefficients
suggests that for females individual 
differences in Verbal Reasoning and 
Abstract Reasoning make important 
contributions to the determination of 
attitudes toward mathematics. 


Math Attitudes and Experience with 
Mathematics 


Ratings of Mathematics Teachers. The 
first portion of Table 3 presents evidence
concerning the predicted positive
relation between students’ ratings of 
their former mathematics teachers and 
their own math attitudes. Correlations 
are between scores on the rating scales 
of the IPDS, referring to remembered 
characteristics of teachers, and the 
Math Attitude Scale scores. The plus 
and minus signs in parentheses after 
the names of the variables in Table 3 
indicate high and low ends of the 
scales, respectively. Thus, the value 
of .34 for males, opposite the Patient 
vs. Impatient scale, evidences a 
significant positive relation between 
positive attitudes toward mathematics 
and reported patience in previous 
mathematics teachers. It is noteworthy 
that more of the items are significant 
for females than for males, though 
none of the differences between the 
coefficients for males and females is 
statistically significant. The number of 
significant coefficients probably does 
not arise by chance. Math attitudes 
are thus apparently related to remem- 
bered impressions of teachers, the 
female more clearly so than the male 
attitudes. 

Parental Encouragement. The ob- 
tained correlations between the Math 
Attitude Scale and IPDS rating scale 
variables used in a test of the hy- 
pothesis of a positive relation between 
attitude and parental encouragement 
in mathematics and study in general 
are listed in the second section of 
Table 3. Since none of the eight cor- 
relations is significantly different from 
zero, there is no evidence that math 
attitudes are related to memory of 
parental encouragement of studying 
mathematics or academic subjects in 
general. 

Parental Attitudes toward Mathe- 
matics. To obtain data on the hy- 
pothesized relation between students’ 
math attitudes and their reports of 
their parents’ attitudes toward mathe- 

TABLE 3

CORRELATIONS BETWEEN ATTITUDE SCALE SCORES AND SCORES ON ITEMS
RELATED TO STUDENT IMPRESSIONS OF MATHEMATICS TEACHERS AND
EXPERIENCES WITH PARENTS

                                          Correlation with
                                          Math Attitude Scale
                                          Males       Females
                                          (N = 96)    (N = 87)

Your impressions of your math teachers:
  Patient (+) vs. Impatient (−)
  Strict (−) vs. Lenient (+)
  Hostile (−) vs. Friendly (+)
  Fair (+) vs. Unfair (−)
  Demanded high standards (+) vs. Did not care (−)
  Domineering (+) vs. Submissive (+)
  Lots of fun (+) vs. Grim (−)
  Brutal (−) vs. Kind (+)
  Clever (+) vs. Dull (−)
  Nervous (−) vs. Controlled (+)
  Knew their subject well (+) vs. Were severely lacking in
    knowledge of subject (−)
  Really knew how to teach math (+) vs. Did not know anything
    about how to teach math (−)

As you experienced your father when you were a child:
  Stressed my school work greatly (+) vs. Paid no attention to
    my school work (−)
  Encouraged me to study math (+) vs. Discouraged me from
    studying math (−)

As you experienced your mother when you were a child:
  Stressed my school work greatly (+) vs. Paid no attention to
    my school work (−)
  Encouraged me to study math (+) vs. Discouraged me from
    studying math (−)

* Significant beyond the .05 level
** Significant beyond the .01 level


matics, Math Attitude Scale scores 
were correlated with the scores on the 
following scales of the IPDS: 


As you experienced your father when you 
were a child: Liked math vs. Disliked math 

As you experienced your mother when 
you were a child: Liked math vs. Disliked 
math 


The correlations between the Math 
Attitude Scale and the two variables 
above were .08 and .10, respectively, 
for the 96 males and .13 and .16 for 
the 87 females. Since these coefficients 
are not significantly different from 
zero, the hypothesized relation between 
students’ reported perceptions of the 
parents’ feelings toward mathematics 
and the students’ own attitudes toward 
the subject is not confirmed. 

Traumatic Experiences with Mathe- 
matics. The answers to the following 
questions on the IPDS were used to 
test the hypothesized relation between 
attitude and traumatic experiences 
with mathematics: 


Can you remember any specific embar- 
rassment or insecurity pertaining to your 
performance in arithmetic or math when 
you were a child? If so, describe. How old 
and in what grade were you when this took 
place? 


A chi square test of independence 
between Math Attitude Scale scores, 
dichotomized at the median, and a 
Yes-No answer to the first question 
gave values of .62 and 1.11 for the 
96 males and 87 females, respectively. 
Since neither value is significant, the 
hypothesis of independence between 
the two variables cannot be rejected. 
There were not enough responses to
the remaining items to make a statisti- 
cal analysis. 
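
The chi square test of independence described above can be sketched as follows for a 2 × 2 table. This is a minimal illustration, assuming the uncorrected Pearson formula (the article does not say whether a continuity correction was applied). The cell counts are hypothetical, because the article reports only the resulting chi square values.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Hypothetical counts for 96 students (not from the article):
# rows = attitude score above / below the median,
# columns = Yes / No answer to the embarrassment question.
chi2 = chi_square_2x2(20, 28, 25, 23)

CRITICAL_05_DF1 = 3.841  # tabled chi square for df = 1 at the .05 level
significant = chi2 > CRITICAL_05_DF1
```

With counts of this kind the statistic falls well below the critical value, which is the pattern the reported values of .62 and 1.11 show.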


DISCUSSION 


Assuming that the various measures 
employed for this study accurately 
assess their respective variables, we 
may conclude that the direct experience
hypothesis which guided the investigation
overall is partly supported, but
only partly.

Math attitudes are apparently re- 
lated to intellective factors and achieve- 
ment, but not to temperament vari- 
ables, at least within the limitations of 
this study. Experiences with former
mathematics teachers are somewhat
related to present math attitudes,
although math attitudes are presumably
unrelated to remembered parental
encouragement, to the parents’ reported
math attitudes, and to the student’s own
remembered traumatic experiences.

The partly unrealistic nature of the 
assumption of validity of measures 
employed in this study is easily seen. 
Final grades, for example, are not 
entirely adequate to assess achieve- 
ment in mathematics. The group tem- 
perament inventory suffers from the 
well-known failings of such measures. 
Within the limitations of the measures, 
however, the conclusion of the influ- 
ence of direct experience upon math 
attitudes may be upheld. 

Other research will have to be under- 
taken to determine if factors not 
investigated here are operative in 
determining math attitudes. At best 
the proportion of variance of the 
variables here associated with math 
attitude variance is small. Other 
variables must be uncovered whose 
variance further accounts for that in 
the Math Attitude Scale. Perhaps a 
less direct method of ferreting out 
traumatic experiences with mathe- 
matics, for one instance, would reach 
some of the bases of mathemaphobia 
and other negative attitudes. Projec- 
tive measures or objective inventories 
other than those we have employed 
might reveal relations not found in 
this study. 


SUMMARY 


The Math Attitude Scale, DAT 
Numerical Ability, Verbal Reasoning, 
and Abstract Reasoning tests, the 
Cooperative Mathematics Pretest for 
College Students, the Minnesota Coun- 


seling Inventory, and an adaptation of 
the Intensive Personal Data Sheet were 
administered to college freshmen in 
mathematics courses to determine in a 
limited way the etiology of math 
attitudes. Regression and correlation 
analyses of the intercorrelations of 
these measures and their relations with 
grades in high school and college 
mathematics courses supported to a 
modest extent the supposition that 
direct experiences in relation to mathe- 
matics contribute to math attitudes. 
Other influences in the determination 
of negative math attitudes are not 
excluded by the findings in this study. 
THE RELATION OF STUDENTS’ NEEDS TO THEIR
PERCEPTIONS OF A COLLEGE ENVIRONMENT¹


ANNE McFEE²


Syracuse University 


Over the past 3 years the College 
Characteristics Index (CCI) (Stern 
& Pace, 1958) has been filled out by 
several thousand students in more 
than a hundred colleges (Pace, 1960; 
Pace & Stern, 1958; Thistlethwaite, 
1959). The instrument is intended to 
give an estimate of the press of the 
college environment. The 30 press 
scales in the CCI parallel the 30 needs 
scales in the Stern Activities Index 
(AI) (Stern, 1958). In using and 
interpreting the instrument, it is 
clearly important to know whether 
the personality of the students who 
answer its items has any appreciable 
relationship to the way they answer 
them. 

McConnell and Heist (1959), noting 


that the personality characteristics of 
student bodies vary widely from one 
college to another, and even between 
colleges which are highly selective in 
scholastic aptitude, have raised the 


question: “Do students make the 
college?” If one is to study the inter- 
action between students and environ- 
ments one must have independent 
estimates of each. The CCI should 
give an estimate of the environmental 
press independent of the personality 
needs of the students responding to it. 
Does it in fact do so? This study at- 
tempted to answer this question on 
two levels: the general relation between 
corresponding need and press measures, 
and the specific relation of each CCI 


1 This paper is based on the writer’s MA 
thesis in Psychology at Syracuse University. 
The author wishes to thank C. R. Pace for 
his extensive suggestions and guidance 
throughout the course of the study. 

2 Now at Stanford University. 


item to a relevant personality need 
scale. 

Two other factors were studied. The 
first is the objectivity of the CCI 
item. The hypothesis here is that the 
more easily verifiable the behavior or 
knowledge the item describes, the less 
likely it is that people will see it in an 
individual way. The second factor 
considered is the likelihood that the 
student has a basis of personal ex- 
perience for saying that the item is 
“true” or “false” of his environment. 
The hypothesis is similar to the one 
relating to the objectivity of the item, 
i.e., the more familiar the students are 
with the behavior described in the 
item, the more they will agree on its 
truth or falsity. When few students 
have experienced the behavior in 
question, there will be more disagree- 
ment. 

The respondent to the CCI is asked 
to report whether certain specified 
behaviors or conditions are true of 
his college environment. It seems likely 
that this situation would tend to 
minimize unique, individual sets, and 
facilitate the expression of opinions 
that have been acquired by members 
of the college group through contact 
with the immediate environment. The 
perceptions of a particular group 
should show a high degree of agree- 
ment, but there should be considerable 
differences among groups. Preliminary 
study of the variance of scale scores 
within and between colleges on an 
earlier version of the CCI has shown 
this to be the case (Pace & Stern, 


1958). 
“Perception of the environment” is defined
by the answers students give to the
300 items of the CCI. Briefly, the index con- 
sists of 30 scales, with 10 true-false items 
in each scale. The items describe common- 
place activities or conditions which occur 
or might occur at college. The scales de- 
scribe 30 types of environmental press. A 
press is an aspect of the environment which 
tends to encourage or reward a particular 
type of behavior. For example, if the item 
“Many upperclassmen play an active role
in helping new students adjust to campus
life” is answered “true” it is presumed to
reflect a press toward nurturant behavior. 

The personal characteristics of the stu- 
dent examined in this investigation are his 
scores in the AI. The AI also consists of 30 
scales of 10 true-false items each, but it is 
designed to measure level of need in the 
individual, rather than intensity of press 
in the environment. 

The items of the CCI were classified for 
objectivity and exposure value by three 
judges working together. Approximately 2 
weeks later, two of the judges went over 
the classifications. When there was doubt 
about an extreme item, it was shifted toward 
a middle category, in an effort to make 
extreme categories as free as possible from 
ambiguities. In this second session, only 
about 10% of the items were shifted. 

For each item of the CCI, response fre- 
quencies were tabulated for people answer- 
ing in the direction of the key, and in the 
opposite direction. In other words, for each 
item, there was a group of respondents who 
“passed” it, and another group who
“failed” it. The mean AI score, on the
scale corresponding to the CCI scale in
which the item appeared, was calculated
separately for the “pass” and “fail” groups.
For each CCI item, t tests were computed
between AI scale means for the “pass” and
“fail” groups. For a sample of this size, the
t’s would not be significant if the differences
between the means were less than .50, so
t ratios were calculated only for items showing
differences of .50 or above. This minimum
difference was necessary for significance
no matter how the “pass” and “fail”
groups were proportioned (97-3, or 50-50).
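
The item-level comparison just described can be sketched in Python. This is an illustration only: it assumes an ordinary pooled-variance t test, which the article does not specify, and the function name and the small data set are hypothetical.

```python
import math
from statistics import mean, variance

def pass_fail_t(item_responses, need_scores):
    """Pooled-variance t between need-scale means of the 'pass' and 'fail' groups.

    item_responses: list of bools (True = answered in the keyed direction).
    need_scores: the same students' scores on the corresponding AI scale.
    Returns (mean difference, t, df), or None when the test cannot be computed.
    """
    passed = [s for r, s in zip(item_responses, need_scores) if r]
    failed = [s for r, s in zip(item_responses, need_scores) if not r]
    if len(passed) < 2 or len(failed) < 2:
        return None  # a group with fewer than two members has no variance
    n1, n2 = len(passed), len(failed)
    diff = mean(passed) - mean(failed)
    pooled = ((n1 - 1) * variance(passed) + (n2 - 1) * variance(failed)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    if se == 0:
        return None  # identical scores everywhere; t is undefined
    return diff, diff / se, n1 + n2 - 2

# Hypothetical responses to one CCI item and scores on the matching AI scale.
responses = [True, True, True, False, False, False]
scores = [6, 7, 8, 4, 5, 6]
result = pass_fail_t(responses, scores)
```

In the study this comparison would be repeated for each of the 300 items, skipping those whose mean difference falls below the .50 screening value.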

Subjects. Responses to both the CCI and 
the AI were obtained from 100 students in
introductory psychology classes at Syracuse 
University. They are not necessarily a 
representative sample of Syracuse students; 


the study is not intended to provide a 
generalization about a particular univer- 
sity, but only about groups of college stu- 
dents. 


RESULTS 


Perception Related to Personality Needs 


Pearson product-moment correla- 
tions were calculated between each 
pair of scales carrying the same label 
on the AI and CCI, e.g., between 
Need Achievement-Press Achievement. 
These correlations ranged from —.007 
to .057. The median correlation was 
.006. Of the 30 correlations, 24 were 
in a positive direction, and 6 were 
negative. 

For a sample of 100 cases, a Pearson 
r of .197 will be significant at the 5% 
level of confidence. None of these 
correlations is significant. Since on a 
chance basis alone occasional r’s of 
much larger magnitude than those 
obtained would be expected (one or 
two above .197) all computations were 
double checked. It is not likely that 
the lack of demonstrated relationship 
is due merely to low scale reliabilities. 
Reliabilities for the CCI and the AI 
were computed using Kuder-Richard- 
son Formula 20 (Stern, 1959). These 
range from .34 to .81 for the CCI, 
with a mean of .65. For the AI they 
range from .40 to .88 with a mean of 


Response Uniformity Related to Ob- 
jectivity and Exposure Value of 
Items 


To test the hypothesis that easily 
verifiable items will be answered more 
consistently by the students, the 300 
CCI items were classified as (a) highly 
objective, easily verifiable from obvious 
criteria; (b) somewhat objective, veri- 
fiable from criteria requiring an 
observer or otherwise less obvious; and 

TABLE 1

VARIABILITY OF STUDENT RESPONSES AND RELATION OF STUDENT NEEDS
TO CCI ITEMS VARYING IN POSSIBILITY OF OBJECTIVE VERIFICATION
AND IN EXPOSURE VALUE

                                          % producing   % producing   % significantly
Characteristic             No. of Items   uniform       divided       related to
                                          responses     responses     needs(a)

Objectivity of Items:
  Highly Objective Items        62            13            42             8
  Somewhat Objective
  Subjective                   131             5            65             8
  z (highly objective
    vs. subjective)                          2.0*          2.3*

Exposure Value of Items:
  High Exposure                 51            24            41             6
  Medium Exposure
  Low Exposure                 130             3            62            19
  z (high vs. low exposure)                  4.2**         2.6*          2.2*

(a) Significant t's (.05 level) between AI scale means for “pass” and “fail” groups on each CCI item.
* Significant at the .05 level.
** Significant at the .01 level.


(c) subjective, confirmable only by 
asking more people the same question. 
Response frequencies were compared 
for the three groups. The results are 
shown in Table 1. 

There is a high degree of uniformity 
of response to an item if nearly every- 
one answers it the same way, whether 
this is in the direction of the key or in 
the opposite direction. Of the 62 items 
classified as highly objective, 13% had
response percentages of 90 and above 
or 9 and below, compared to 5% of 
the 131 items classified as highly 
subjective. Thus the more objective 
items tend to evoke a somewhat larger 
percentage of highly uniform responses 
than do the subjective items. This dif- 
ference is significant at the 5% level 
(z = 2.0). 
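
A z test of this kind can be sketched as below. The sketch assumes the common pooled-proportion formula, which the article does not name, and the item counts are reconstructed from the rounded percentages in the text (about 13% of 62 and about 5% of 131). The result should therefore be expected to fall near the reported z, not to match it exactly.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z for the difference between two independent proportions (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Assumed counts: 8 of 62 highly objective items and 6 of 131 subjective
# items drew highly uniform responses (reconstructed, not reported).
z = two_proportion_z(8, 62, 6, 131)
```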

Of the highly objective items, 42%
fall in the middle range of response
percentages (30 to 70),
and 65% of the
highly subjective items fall into this
range. The middle range
represents the responses that are the 


least uniform, since they center on 
the 50%, or completely divided, level. 
A smaller proportion of highly objec- 
tive than highly subjective items falls 
in this range, i.e., produce disagree- 
ments among respondents. This dif- 
ference is significant at the 5% level 
(z = 2.3).
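
The two response bands used in these comparisons can be expressed as a small classification rule. This is a sketch only; the function name and the label "intermediate" for items falling in neither band are assumptions, not terms from the article.

```python
def response_band(percent_keyed):
    """Classify an item by the percentage of students answering in the keyed direction."""
    if percent_keyed >= 90 or percent_keyed <= 9:
        return "uniform"      # nearly everyone answers the same way
    if 30 <= percent_keyed <= 70:
        return "divided"      # responses center on the 50% level
    return "intermediate"     # hypothetical label for the remaining items

# Hypothetical keyed-response percentages for four items.
bands = [response_band(p) for p in (95, 50, 80, 5)]
```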

To examine the hypothesis that 
students will agree more in reporting 
behavior with which they are all 
likely to be familiar, the items were 
classified under three levels of probable 
familiarity or “exposure value”: high, 
medium, and low. 

Of the 51 items classified as high in 
exposure value, 24% had response 
percentages of 90 or above or 9 and 
below, compared to only 3% of the 
130 items classified as low in exposure 
value. The percentage of uniform 
responses to items of high exposure 
value was eight times the percentage of 
uniform responses to items of low 
exposure value. This difference is 
significant beyond the 1% level (z = 
4.2). 
Of the high exposure items, 41% fell
in the middle response range, and 62%
of the low exposure items fell in this 
range. A smaller proportion of high 
exposure than low exposure items 
produced disagreement among re- 
spondents. This difference is significant 
at the 5% level (z = 2.6). In this 
respect, there is no difference between 
the objective-subjective and high 
exposure-low exposure types of classi- 
fication. Both produce the same 
proportion of items in the middle 
response range. 


Relation of Responses to Personality 
Need, Objectivity, and Exposure 
Value of Item 


Of the 300 CCI items, 36 (12%) 
showed t ratios, significant at the .05
level or above, between “pass” and 
“fail” group means on the correspond- 
ing AI scale. 

Of the items classed as highly objective,
8% showed significant t ratios.
Of the highly subjective items, again
8% showed significant t ratios. The
objectivity of an item does not seem 
to be related to the degree of influence 
of personality needs on student re- 
sponses. 

Six percent of the high exposure 
items showed significant t ratios between
“pass” and “fail” group means
on the corresponding AI scale. Of the
low exposure items, 19% showed
significant t ratios. When a student,
lacking direct experience, must guess 
at the answer to a question, whether 
objective or subjective in content, his 
own needs are more likely to influence 
his judgment to a significant degree, as 
reflected by these t ratios. The difference
in proportions of significant
t’s between the high and low exposure
items is significant at the 5% level
(z = 2.2).


SUMMARY 


This study attempted to clarify 
some of the relations between student 
perception of the college environment 
and various other factors. If the 
College Characteristics Index is to be 
useful to investigators as an objective 
indicator of differences between col- 
leges, it should be independent 
of the personality needs of the 
informants filling it out. This 
study failed to find any correlation 
between scale scores of individuals on 
the CCI and their parallel scores on 
the AI, a personality test using parallel 
scale classifications; nor was a strong 
relation found between personality 
need and the students’ perception of 
environmental press, as reflected by 
individual items. The responses to 88%
of the 300 CCI items were independent 
of the parallel personality need of the 
respondent. 

Differences in objectivity of indi- 
vidual items produced a moderate 
difference in uniformity of response to 
the items, but produced no discernible 
differences in the influence of need on 
the item responses. Items about be- 
havior or conditions which the student 
is unlikely to have encountered (i.e., 
those low in “exposure value”) produced
much less agreement, and were
much more influenced by need, than 
were items about more widely shared 
experience. 
INHIBITION PHENOMENA IN FAST AND
SLOW LEARNERS


JAMES B. STROUD
State University of Iowa

AND

LAMORE J. CARTER
Grambling College


The purpose of this study was to
investigate intralist and interlist in-
hibition as a function of length of list
and level of ability (rate of acquisition).
Two groups of subjects, widely sep-
arated in rate of learning, were com-
pared both on learning and recall of
lists of two different lengths, in which
all items by all subjects were learned
to a common criterion of two and only
two reinforcements. By definition,
slow learners require more trials, more
item presentations, than fast learners.
They require more presentations to
achieve the initial correct responses in
the course of learning a list. It is also
possible that the stability of correct
responses, as determined by the prob-
ability of their occurring twice in
succession, once they have occurred,
may vary with ability level. On long
lists, all subjects require more item
presentations to achieve the initial
correct responses than they do on
short lists (Robinson & Darrow, 1924;
Robinson & Heron, 1922).

Also it is possible that the increasing
of length would affect the stability of
responses of all subjects, but that of
slow learners more than of fast learners.
Possibly fast learners behave on long
lists much as do slow learners on short
ones. One of the more important dif-
ferences between the two ability groups
may lie in a difference in their ability to
withstand the effects of intralist in-
hibition.

As stated thus far, the experiment
was so planned as to: (a) investigate
the relative effects of length upon
trials required to learn by two widely
separated ability groups, and (b)
compare recall scores of the two groups
of subjects on two different lengths of
lists, under the condition in which all
individual items on both lists by all
subjects were learned to a common
operational criterion.

As an attempt to further the purpose 
of this investigation, certain items just 
learned as a warm-up exercise were 
interspersed both in the short and the 
long list. It seemed worthwhile to 
know whether or not the retention of 
the warm-up items, when presented in 
a larger context, would vary with 
ability level and length of list.¹ This
amounts to what appears to be a 
procedure for investigating retroactive 
inhibition as a function of length of 
list and of ability level. 

The question of differences by ability 
level in susceptibility to retroactive 
inhibition seems rather important. In 
two recent investigations, no relation- 
ship was found between rate of ac- 
quisition and recall (Stroud & Schoer, 
1959; Underwood, 1954). If ability
differences are associated with resistance
to inhibition as just suggested and
if this should turn out to be a general 
phenomenon, then slow learners should 
be more susceptible to retroactive 
inhibition effects than fast learners, 
and should, in terms of widely accepted 
theory of forgetting, retain less well 
what they learn. Perhaps all this is a 
bit tenuous, but it does suggest some 
outcomes that would appear to be 
incompatible with the experimental 
data reported in the two investigations 
just mentioned. 


¹ This procedure appears to be an efficient
method of investigating both proactive and 
retroactive inhibition—particularly the 
latter since it very largely gets rid of the 
troublesome problem of rehearsal. 
PROCEDURE


Preliminary learning tasks were em- 
ployed for purposes of selecting two groups 
of learners, one fast and one slow. A list 
of 12 paired adjectives and one of 10 paired 
picture-names were used. The latter con- 
sisted of pictures (faces and shoulders) of 
male college students and fictitious first 
and last names. These were placed on film 
strips and projected upon a screen before 
classes of college sophomores. Throughout 
the first trial, a picture or an adjective, as a 
stimulus member, was exposed for 2 seconds; 
then both members of a pair were simul- 
taneously exposed for 2 seconds. When all 
the items in the list were thus exposed, there 
followed a recall trial in which the stimulus 
members alone were exposed, 4 seconds for 
the adjectives and 6 seconds for the pictures. 
During this time the subjects wrote in 
specified blank spaces such of the responses 
as they could recall. Two exposure and 
recall trials were allowed on the paired 
adjective list and three on the paired pic- 
ture-name list. Total number of correct 
responses on the two lists combined con- 
stituted the scores. 

By this means, two groups of 32 subjects 
each were selected, representing the top 
and bottom 15% of the group sampled. 

In the experiment proper, two lists of 
paired adjectives of average associative 
and familiarity values from the Haagen 
(1943) list were used. Three separate lists 
of 12 items each were prepared. First one, 
then another, of these lists was used as a 
12-item list, each being used an equal num- 
ber of times for this purpose. The two re- 
maining lists were combined to form a 24- 
item list, which was learned as such. The 
three blocks of 12 items were systematically 
rotated throughout among the 12- and 24- 
item lists, in an attempt to control possible
differences in item difficulty between the 
long and short lists. 
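
The rotation of the three 12-item blocks can be sketched as follows. The sketch assumes a simple cyclic assignment of subjects to the three possible arrangements, which is one way of using each block equally often; the article does not give the exact scheme. The function names and the placeholder pairs (standing in for the Haagen adjective pairs) are hypothetical.

```python
def list_assignments(blocks):
    """Yield (short_list, long_list) for each rotation of three 12-item blocks."""
    for i, short in enumerate(blocks):
        remaining = [b for j, b in enumerate(blocks) if j != i]
        # The two blocks not used as the short list are combined into the long list.
        long_list = [pair for block in remaining for pair in block]
        yield short, long_list

# Placeholder pairs: A1..A12, B1..B12, C1..C12.
blocks = [[f"{name}{k}" for k in range(1, 13)] for name in "ABC"]
rotations = list(list_assignments(blocks))

def assignment_for(subject_index):
    """Cycle subjects through the rotations so each block serves equally often."""
    return rotations[subject_index % len(rotations)]
```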

The adjective pairs were inscribed on 4 ×
6 plastic cards, a pair per card, appropriate 
for use in the Card Master. In the experi- 
ment, the first member of a given pair was 
exposed for 2 seconds, following which both 
members were simultaneously exposed for 
2 seconds. As learning proceeded, subjects 
were required to anticipate verbally the 
response member within the 2-second inter- 
val in which the stimulus member alone was 
exposed. A trial consisted of a single expo- 
sure, in this manner, of all the items in a 
list. The serial order of the cards was varied 
systematically throughout the learning 
trials. Each subject learned both the long 
and the short list, the order alternating
from subject to subject. An interval of 4
seconds was interspersed between trials. 

Each subject, just prior to undertaking 
each of his learning tasks, learned a warm- 
up list consisting of three pairs, by the 
method just described. Each such list was 
learned to a criterion of five errorless trials. 
For purposes stated presently, each of the 
three warm-up pairs, when learned, was 
placed in the appropriate list to be learned. 
Operationally this increased the length of 
each list by three items. On the short list 
the three warm-up items were placed in the 
fifth, tenth, and fifteenth positions; on 
the long list, at the ninth, eighteenth, and 
twenty-seventh positions. 
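The placement of the warm-up pairs can be sketched as follows. The function name `intersperse` and the item labels are assumptions made for illustration; only the positions (fifth, tenth, fifteenth and ninth, eighteenth, twenty-seventh) come from the text, and the sketch describes the starting arrangement only, since the serial order was varied across trials.

```python
def intersperse(items, warmups, positions):
    """Build the final list with warm-up pairs at the given 1-indexed positions."""
    total = len(items) + len(warmups)
    slot = dict(zip(positions, warmups))   # position -> warm-up pair
    rest = iter(items)                     # remaining list items, in order
    return [slot[p] if p in slot else next(rest) for p in range(1, total + 1)]


short_list = intersperse([f"pair_{i}" for i in range(1, 13)],
                         ["warm_1", "warm_2", "warm_3"], (5, 10, 15))
long_list = intersperse([f"pair_{i}" for i in range(1, 25)],
                        ["warm_1", "warm_2", "warm_3"], (9, 18, 27))
```

The resulting lists contain 15 and 27 cards, which matches the operational lengths stated above.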

As a further condition of the experiment 
each item, except the warm-up items,² was
withdrawn from the list when and as it was
correctly responded to twice, consecutively 
or not. This procedure was adopted for two 
reasons: it had the effect of shortening 
learning time; it insured the same number 
of operationally defined reinforcements on 
all items of both lists by all subjects. The 
latter seemed especially important for 
investigation of the effect of length and 
ability differences upon retention. 
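The item-withdrawal rule can be made concrete with a small simulation sketch. The function `run_dropout_learning` and the callback `is_correct`, which stands in for a subject's response, are hypothetical; the criterion of two correct anticipations and the retention of the warm-up items come from the text.

```python
def run_dropout_learning(items, is_correct, required=2, protected=(), max_trials=1000):
    """Present the list repeatedly, withdrawing items that reach the criterion.

    is_correct(item, trial) -> bool is a hypothetical stand-in for the subject.
    Items in `protected` (the warm-up items) are never withdrawn.
    Returns (trials, correct), where `correct` counts correct anticipations per item.
    """
    correct = {item: 0 for item in items}
    active = list(items)
    trials = 0
    while any(item not in protected for item in active):
        if trials >= max_trials:
            raise RuntimeError("learning criterion not reached")
        trials += 1
        for item in active:
            if is_correct(item, trials):
                correct[item] += 1
        # Withdraw items answered correctly `required` times, consecutively or not.
        active = [i for i in active if i in protected or correct[i] < required]
    return trials, correct
```

Under this rule every withdrawn item ends with exactly `required` correct anticipations, which is the sense in which reinforcements were equated across items, lists, and subjects.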

Subjects appeared, individually, at ap- 
pointed times, learned the appropriate 
warm-up task and the appropriate main 
experimental task. After an interval of 48 
hours, they reappeared, engaged in a recall 
performance on the original experimental 
task, under the same conditions as those 
under which it was learned, and proceeded 
to learn the second warm-up task and the 
second main experimental task. For this 
task, a recall performance was exacted 48 
hours later. 


RESULTS 


The means of the number of trials 
required to learn the short and the 
long lists by the two ability groups are 
presented in A of Table 1. There is 
little room for doubt that the selection 
procedure used produced two groups of 
subjects far apart in learning ability. 
By the procedure of deleting items 
from the list as soon as they had been 
twice responded to successfully, trials 
became shorter as learning progressed. 
By number of trials is meant the 


2 The Card Master does not work well
when fewer than four cards are in the ma- 
chine. 
TABLE 1
LEARNING AND

Task Score                                        Subjects
                                                  Fast    Slow

A. Means of Trials to Learn
     Short List
     Long List
B. Mean Number of Consecutive Successful Anticipations
     Short List
     Long List
C. Mean Number of Unsuccessful Anticipations of Warm-up Items
     Short List
     Long List
D. Means of Recall Scores
     Short List
     Long List


number of times a list or some part of 
it was presented. 

Under this revised-list method, the 
increasing of the length of list resulted 
in the increasing of the number of 
trials required to learn and did so at 
an accelerating rate. This is in keeping
with results on length obtained by the
conventional method of learning the 
list as a whole to a criterion (Robinson 
& Darrow, 1924; Robinson & Heron, 
1922). 

The increasing of length resulted in 
a greater increase in number of trials 
required to learn by the slow learners 
than by the fast learners—78.88 to 
27.22. An analysis of variance pro-
cedure yielded a significant Length ×
Ability interaction (p < .005). Inci-
dentally, the Length × Order inter-
action was also significant at a like
level, suggesting that the practice 
effects from the long list upon the 
short list were greater than those of 
the short list upon the long one. 

The data were next analyzed with 
respect to the number of times correct 
responses once made occurred twice in
succession, for the two ability groups
and the two lengths. In this analysis 
the three interspersed warm-up items 
were counted, with the result that the 
maximum number of times two con- 
secutive correct responses could have 
been made was 15 and 27, for the two 
lists. The results are presented in B, 
Table 1. 

On both lists, the number of con- 
secutive successful responses made by 
the fast group was somewhat greater 
than that made by the slow group. 
This is consistent with Underwood’s 
(1954) observation that the reinforcing 
of a slow learner’s response contributes 
less to habit strength than that of a 
fast learner. However, the relative 
effect of list length upon the two groups 
was about the same. 

The mean number of unsuccessful 
anticipations of the three previously 
learned items (warm-up) interspersed 
in the lists is presented in C, Table 1, 
by ability level and by length of list. 

Ability differences were significant 
(p < .005). Ability × Length inter-
action was not significant. The dif-
ferences associated with length were
not significant. It seems clear that the 
fast learners could better withstand 
the interfering effects resulting from 
the presentation of the warm-up items 
within a context of similar items than 
could the slow learners. It may be that 
with longer warm-up lists, in the sense 
used here, significant length effects and 
Length × Ability interaction would
have been obtained. 

Recall scores of the two groups of 
subjects on the two lengths were com- 
pared. D, Table 1, shows the mean 
words recalled—within a 2-second 
exposure interval, 48 hours after 
learning—for these comparisons. 

Investigations of the relation be- 
tween list length and recall have ob- 
tained higher scores in percent re- 
called for the longer lists, when entire 
lists were learned to a common criterion 
(Robinson & Darrow, 1924; Robinson
& Heron, 1922). This has generally 
been explained in terms of a relatively 
greater overlearning of some of the 
individual items in the longer lists than 
in the shorter ones. In this experiment, 
all items were learned to the same 
objective criterion. 

Long lists require more trials to 
learn, but when learned are retained 
as well as short lists learned in fewer 
trials. Consistent with this is the 
proposition that slow learners require 
more trials to learn lists of all lengths 
than do fast learners, but when learned 
retain them as well as do fast learners. 
The over-all ability effect was not 
significant. However, separate analysis 
gave a significant difference between 
the two ability groups on the long list. 
The Length × Ability interaction was
significant (p = .05). 

Perhaps the order effects on recall
are worth mentioning in this con-
nection. In all of the foregoing analyses,
order effects (order in which the lists
were learned) were determined. Gen- 
erally, these were large and significant. 
Also, order interactions such as Order ×
Ability and Order × Length were signifi-
cant. In recall, there were no order
effects. 


DISCUSSION


Probably no one doubts that there 
are differences among people in recall or 
in other measures of memory. In most 
practical life situations, differences in 
degree of learning, in familiarity, in 
subsequent utilization of learning, 
rehearsal, and others operate. These 
have an effect upon recall. There 
appears to be no compelling reason to 
posit some kind of differences inherent 
in basic psychological process in order 
to account for the observed differences. 
However, one possible basic difference 
in psychological process does suggest 
itself. 

There is some evidence in the results
of the present investigation that slow
learners are more susceptible to inter- 
list interference than fast learners. 
This suggests that they may be more 
susceptible to the effects of retroactive 
inhibition, in the traditional sense. Our 
findings relative to ability differences 
in the recall of the warm-up items 
(C, Table 1), we think, support this. 
If various kinds of interfering effects do 
affect slow learners more adversely 
than fast ones, and do so generally, 
we should certainly expect recall ability 
to be related to learning ability. This 
assumption follows from the fact that 
retroactive inhibition or interference 
effects is our principal explanation of 
loss in ability to recall learned material. 

As already noted, some recent work 
(Stroud & Schoer, 1959; Underwood, 
1954) suggests that differences in recall 
may be unrelated to differences in 
learning ability. At least this was 
found to be the case in the experiments 
in question. At this stage it would 
seem unwise to generalize very far 
about this. In the experiments just 
referred to, subjects learned lists of 
nonsense syllables, paired adjectives, 
and picture-names, and presumably 
went about their normal business during 
the time between learning and recall. 
In these respects the materials and 
general procedure were not different 
from those employed by earlier workers 
who (by reason of the erroneous use of 
relearning as a measure of retention— 
Stroud & Schoer, 1959) concluded 
that retentive ability is positively 
related to learning ability. 

In the course of normal events, 
college students would encounter little 
material between learning and recall 
calculated to interfere with the recall 
of the learned material. At least inter- 
ference effects should be at a minimum. 
In any event, it would be interesting 
to ascertain whether or not differences 
in learning ability would be found to 
be associated with differences in recall, 
were appropriate interpolated learning
systematically introduced. Conceivably 
retention and learning ability may be 
related after all. 

There is, however, one disquieting 
thought: a list of nonsense syllables, 
learned in a psychological laboratory,
ought to be one of the best remembered 
things in the world. Hardly anywhere 
else would one encounter so few things 
within a 24- or 48-hour retention period 
to interfere with the recall of the 
learned material. Obviously, such 
material is forgotten at a rapid rate. 
Incidentally, the fact that conventional 
laboratory material is forgotten so 
readily under conditions which from 
the standpoint of interference should 
be highly favorable to its retention, 
suggests that there may be factors 
other than retroactive inhibition op- 
erating in forgetting. 

Loss in availability of response may 
in such cases contribute to the rela- 
tively rapid rate of forgetting, es- 
pecially when the subject has only a 
short interval in which to make a 
response. Perhaps loss of warm-up 
effects, or loss of appropriate mental 
set, which warm-up exercises may 
help to restore, are involved. Again, 
this phenomenon may be especially 
important in the recall of nonsense 
syllables or pairs of unrelated words, in 
comparison with the recall of meaning- 
ful material, where the subject is 
frequently able to make use of a rich 
complement of associative cues. 


SUMMARY 


Intralist inhibition as a function of 
ability level and length of list has been 


JAMES B. STROUD AND LAMORE J. CARTER 


investigated. Two groups of subjects 
widely separated in learning ability 
learned and recalled (after 48 hours) 
two lists of paired adjectives of 12 
and 24 pairs. All items on both lists 
by all subjects were learned to a 
criterion of two and only two correct 
anticipations. Increasing list length 
resulted in a disproportional increase 
in number of trials required to learn, 
but in a proportional increase in the 
number of words recalled. Over-all, 
ability differences in recall were not 
significant. However, they were sig- 
nificant on the long list. The interspers- 
ing of previously learned items, re- 
ferred to as warm-up items, within 
the two lists resulted in ability dif- 
ferences in loss of response. The 
results are discussed in relation to 
ability differences in recall. 
CONSISTENCY AND WISDOM OF VOCATIONAL
PREFERENCE AS INDICES OF VOCATIONAL 
MATURITY IN THE NINTH GRADE 
DONALD E. SUPER 


Teachers College, Columbia University 


The practice of inquiring concern- 
ing the vocational preferences and 
ambitions of junior and senior high 
school students in helping them to 
make educational or prevocational 
choices and plans is widespread and 
based on compelling arguments: it 
would be unthinkable, in a democratic 
society, to require the determination 
of the directional choices which stu-
dents make in selecting college pre-
paratory, commercial, trade, home
economics, agriculture, and general
courses, without taking into account 
the goal to which the youth aspires. 
High school guidance questionnaires 
to be filled out by entering pupils ask 
what occupation they hope to enter,
and numerous studies have been made
of ways in which to elicit information 
concerning vocational preferences 
(Beilin, 1952; Gilger, 1942; Hamburger, 
1958; Trow, 1941). 

The consistency, and particularly 
the wisdom or realism, of vocational 
preferences have often been used as 
measures of the effectiveness of vo- 
cational guidance programs by practic- 
ing counselors and by educational 
and psychological research workers 
(Froelich, 1949; Williamson & Bordin, 
1941). Having a vocational objective is 
important in a society in which earning 
a living is important, in which occupa- 
tional roles are of major significance, 
and in which education is, in effect if 
not avowedly, occupationally oriented: 
having a vocational preference, in 
this context, gives purpose to behavior 
and makes possible educational and 
vocational decisions. It has been 
argued that consistency of vocational
preferences shows intensity and validity
of interest, and that it is better to work 
consistently toward one clear-cut goal 
than wastefully to keep shifting 
objectives. In the case of wisdom of 
vocational preferences, the reasoning is 
that realistic goals are by definition 
attainable, whereas unrealistic or 
unwise goals are by definition those 
which one is not likely to attain or 
with which one is not likely to be 
satisfied if one does attain them. 

Studies using consistency of voca-
tional preferences are not very common,
this being typically a counselor’s 
method of judging how seriously to 
take a student’s expressed preference, 
but the method is illustrated by 
Rothney’s recent Wisconsin Guidance 
Study (1958). 

Wisdom or realism of vocational 
preferences has been used as a criterion 
of the effectiveness of vocational guid- 
ance programs in many studies, e.g., 
Sparling (1933), Kefauver and Hand 
(1941), Stone (1949), Rothney and 
Roens (1950), Hoyt (1955), Rothney 
(1958), and Hewer (1959). Indices of 
realism are often used by counselors in 
judging students’ and clients’ need 
for guidance. 

Despite the widespread acceptance of 
the importance of consistency and 
wisdom of vocational preferences, 
many writers on vocational guidance 
and on vocational development have 
questioned the significance of expressed 
vocational preferences in early adoles- 
cence. Fryer’s (1931) review pointed 
clearly to the conclusion, confirmed by 
Carter’s later review (1944), that the 
expressed preferences of boys and 
girls in their early and middle teens
are unstable. More recently still, 
Schmidt and Rothney (1955) reported 
convincing evidence on the instability 
of expressed vocational preferences 
from one year of high school to the 
next and into the first year out of 
school: only 49% of the “choices” of 
the tenth grade remained the same in 
eleventh grade, and this figure was 
reduced to 35% in the twelfth grade, 
and to 24% in the year following 
graduation. 

At the same time, it must be noted 
that expressed preferences do have 
practical significance when viewed from 
a certain vantage point and at certain 
ages. Dyer (1939) showed that, among 
college students, vocational preferences 
which had been constant over a period 
of years were related to subsequent 
occupational choice. Strong (1955) and 
McArthur and Stevens (1955) reported 
that, among students at Stanford and 
Harvard, expressed vocational pref- 
erences have considerable predictive 
value for adult occupation. As Dyer’s 
study involved preferences which were 
constant over a long period of years 
(omitting all cases in which change of 
preference had taken place) and the 
others dealt with expressions of pref- 
erence in late adolescence and early 
adulthood among students of superior 
intellectual, educational, and socio- 
economic status, they do not conflict 
with the studies of more heterogeneous 
groups of early adolescents which have 
already been cited. 

In view of the persistent use of the 
concepts of consistency and wisdom 
of vocational preferences despite their 
tendency to be unstable in early adoles- 
cence, it is important to examine their 
psychological significance. In the Ca- 
reer Pattern Study of the Horace 
Mann-Lincoln Institute of School
Experimentation several such measures
were developed (Super & Overstreet,
1960) and related to other variables in
the ninth grade.


METHOD

Subjects 

The subjects of this study were the core 
group of the Career Pattern Study of the 
Horace Mann-Lincoln Institute of School 
Experimentation, 105 ninth grade boys who 
were found to be typical in age, intelligence, 
socioeconomic status, and other key varia- 
bles of ninth grade boys in Middletown, 
New York, in the early 1950’s. As Middle- 
town itself is an average town on a variety 
of social and economic indices (Super & 
Overstreet, 1960), these boys may be con- 
sidered typical of ninth graders in many 
American communities. All of the CPS 
boys indicated, in their interviews, at least 
one tentative vocational preference. 


Measures of Consistency and Wisdom 
of Preferences 


Three measures of consistency of voca- 
tional preferences were developed, the third 
measure being essentially a combination of 
the first two. The expressions of vocational 
preferences were obtained in tape recorded 
interviews in which the subjects of the 
study were asked “... about your plans 
for the future. What would you like to be 
by the time you’re thirty?” This question
was followed up with nondirective leads 
designed to keep the boy talking about his 
vocational preferences, and with probes 
designed to make sure that relevant infor- 
mation was obtained if not offered spon- 
taneously (Super, Crites, Hummel, Moser, 
Overstreet, & Warnath, 1957). In the anal- 
ysis of the data the first four vocational 
preferences expressed (if that many were 
forthcoming) were tabulated. (It should be 
noted that these are measures of consistency 
of first and alternative preferences at one 
point in time, not consistency of first prefer- 
ences over time.) 

An index of Consistency within Fields was 
obtained by classifying the expressed prefer- 
ences according to the occupational field 
classification developed by Roe and modi- 
fied by Moser, Dubin, and Shelsky (1956). 
The total number of fields into which the 
boy’s preferences fell, minus 1, was his 
index of Consistency within Fields. 

A second index, Consistency within Levels, 
was developed by classifying the same 
preferences according to the modified Roe 
occupational level scale, the score being the
total number of levels at which expressed 
preferences fell, minus 1. 

A third index, Consistency within Fam- 
ilies, was developed, by summing the scores 
on the above two indices, an occupational 
family being defined as a combination of 
field and level (e.g., technical occupations 
at the professional level constitute a family 
by this definition). 
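The three consistency indices defined above reduce to simple counts, as the following Python sketch suggests. The function name `consistency_indices`, the numeric level codes, and the sample preferences are hypothetical; the field labels are borrowed from those mentioned elsewhere in this report.

```python
def consistency_indices(preferences):
    """Compute the three consistency indices for one boy.

    `preferences` is a list of (field, level) pairs; only the first four
    expressed preferences are tabulated. A score of 0 means that all
    preferences fall in a single field (or at a single level).
    """
    prefs = preferences[:4]
    fields = {field for field, _ in prefs}
    levels = {level for _, level in prefs}
    within_fields = len(fields) - 1
    within_levels = len(levels) - 1
    return {
        "fields": within_fields,
        "levels": within_levels,
        "families": within_fields + within_levels,  # sum of the two indices
    }
```

As defined, lower scores indicate greater consistency, and the Families index cannot be independent of the other two because it is their sum.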

Four measures of wisdom or realism of 
vocational preferences are relevant to this 
study. Although no one of these can be 
viewed as a sufficient index of wisdom of 
choice, each of them involves a variable 
which is widely accepted and objectively 
justifiable as one measure or criterion of 
realism. 

The first index of wisdom or realism was 
one of Agreement between Ability and Prefer- 
ence, or, more accurately, of agreement 
between the measured intelligence of the 
individual and the intelligence character- 
istic of persons employed in the occupa- 
tion of his first preference. If the boy’s score 
on the Otis Quick-Scoring Mental Ability 
Test, converted into an AGCT equivalent, 
exceeded that of the bottom quarter of the 
men in his preferred occupation as shown 
by the manual for the Army General Classi- 
fication Test, his ability was considered to 
be in agreement with the occupational intel- 
ligence requirements; if it was equal to that 
of men in the bottom quarter, his ability 
was considered not to be in accord with the 
requirements. 

The second wisdom measure was the 
index of Agreement between Measured Inter- 
ests and Preference. Interests were measured 
by the Strong Vocational Interest Blank, 
slightly modified to insure comprehension 
at the ninth grade level, and each boy’s 
interests were classified as primary, second- 
ary, or tertiary in the family in which his 
first preference fell, using Darley’s (1941) 
method of classifying interest score pat- 
terns. When the measured interest pattern 
was primary in the field corresponding to 
the expressed preference, a score of 4 was 
assigned, when the pattern was secondary 
the boy was given a score of 3, etc. 

The third wisdom index was that of 
Agreement between Occupational Level of 
Measured Interests and Level of Preference. 
The socioeconomic level of the boy’s inter- 
ests (occupational interest level) was meas- 
ured by the Strong OL scale. If the boy’s 
OL score was not more than one standard 
deviation below the mean OL score of his 
preferred occupation, as shown on Table 50 


in Strong’s monograph (1943, p. 192), the 
interest and preference levels were con- 
sidered to be in agreement. 

The final wisdom measure to be discussed 
here is that of the Socioeconomic Accessi- 
bility of Preference. The family bread- 
winner’s occupation was rated according 
to the Hamburger revision (1958) of the 
occupational rating scale included in the 
Index of Status Characteristics (Warner, 
Meeker, & Eells, 1949). The boy’s vocational 
preference was rated on the same scale, the 
agreement or disagreement of the two rat- 
ings was ascertained, and the size of the 
discrepancy was the index of socioeconomic 
accessibility. 
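Three of the wisdom indices described above are simple decision rules, sketched here in Python. All function and parameter names are hypothetical, and treating the accessibility discrepancy as an absolute difference is an assumption of the sketch. The interest-pattern index is omitted because its full scoring scale is not spelled out above.

```python
def ability_agrees(agct_equivalent, occupation_first_quartile):
    """Ability and Preference agree only if the score exceeds the bottom quarter."""
    return agct_equivalent > occupation_first_quartile


def interest_level_agrees(ol_score, occupation_mean_ol, occupation_sd_ol):
    """Levels agree if the OL score is not more than one SD below the occupation mean."""
    return ol_score >= occupation_mean_ol - occupation_sd_ol


def accessibility_discrepancy(breadwinner_rating, preference_rating):
    """Size of the gap between the breadwinner's rating and the preference rating.

    Using the absolute difference is an assumption of this sketch.
    """
    return abs(breadwinner_rating - preference_rating)
```

Note that a score exactly equal to the bottom-quarter value counts as disagreement under the first rule, as the text specifies.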


RESULTS 


It should be emphasized that no 
one of these indices is considered com- 
pletely satisfactory, but that there are 
good arguments in support of each of 
them. For example, the occupational 
intelligence data of World War II are 
recognized as an imperfect sample of 
civilian occupations (Super, 1949), but 
they are the best available and have 
proved generally usable; furthermore, 
the use of the first quartile as the cut- 
ting point takes into account the great 
range of intelligence which makes 
possible success in any one occupation. 
In the case of socioeconomic accessi- 
bility, it may be objected that it is 
unwise, in a fluid and democratic soci- 
ety, to judge the realism of a vocational 
preference by its correspondence with 
the status of the parents; but even in 
our democratic society many studies 
show that the occupation entered by 
the child tends to be at the same 
socioeconomic level as that of the 
parent (Super, 1957), and even when 
the child changes socioeconomic levels 
he does it with the help or hindrance 
of his parents’ resources, contacts, 
information, and values. Each of 
these measures of the wisdom of a 
vocational preference is, therefore, one 
of the possible components of a con- 
ceptually more satisfying and valid 
global index of realism, in which 
TABLE 1


INTERCORRELATIONS OF INDICES OF CONSISTENCY AND WISDOM OF NINTH GRADE
VOCATIONAL PREFERENCES


Consistency 


Level Family Aptitude Interest Level 


Consistency: 
Within Fields 
Within Levels 
Within Families 
Wisdom in terms of: 
Aptitudes 
Measured Interests 
Measured Interest Level 
Accessibility 


22*


16*
-01

26**

08


81
06


-12
04
13


Note.—N = 105 boys. Decimal points have been omitted. 


* Spuriously high because of use of one measure in the other. 


* Significant at the .05 level, one-tailed test. 
** Significant at the .01 level, one-tailed test. 


having the appropriate intelligence 
might, for example, offset not having 
the socioeconomic background which 
would be a help in achieving one’s 
ambitions. 


Agreement among Indices of the Con- 


sistency and Wisdom of Ninth 
Grade Vocational Preferences 


If consistency and wisdom of 
vocational preferences are valid con- 
cepts with which to work in dealing 
with ninth graders, one should find at 
least a moderate degree of agreement 
(correlations of .30 to .50) between 
various measures of consistency of 
vocational preferences, a similar degree 
of agreement among indices of the 
wisdom of the vocational preference, 
and some agreement (correlations of
.20 to .35) between consistency and
wisdom indices. Table 1 reports the
intercorrelations of seven CPS indices
of consistency and wisdom of voca- 
tional preferences, for the 105 ninth 
grade boys in the core group. 

The indices of Consistency of Field 
and of Consistency of Level are, of 
course, highly correlated with the 
index of Consistency of Families: this
happens because the last-named index
is a combination of the first two. Only 
Field and Level in this group of three 
measures are so constructed as to be 
operationally independent of each 
other, so that the correlation between 
these two measures has meaning; it 
is .22, significant at more than the .05 
level but below the .01. This suggests 
a slight tendency for the ninth grade 
boys who are most consistent in aspir- 
ing to occupations which are at the 
same level to be the most consistent in 
aspiring to occupations which are in 
the same general field. The implication 
of this very slight relationship is that 
the concept of “consistency of voca-
tional preferences” has minimal mean-
ing when boys are in the ninth grade,
although it may take on more signifi- 
cance at a later age. 
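The significance statement for the correlation of .22 can be checked with the usual t transformation of a correlation coefficient. The sketch below is only a rough check: the critical values are approximate one-tailed table values for about 100 degrees of freedom, entered by hand as an assumption, and the function name `t_from_r` is hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Approximate one-tailed critical values of t for roughly 100 df (assumed table values).
T_CRIT_05 = 1.66
T_CRIT_01 = 2.36

t_value = t_from_r(0.22, 105)
significant_05 = t_value > T_CRIT_05
significant_01 = t_value > T_CRIT_01
```

For r = .22 and N = 105 the statistic is about 2.29, which lies between the two critical values and so agrees with the statement that the correlation is significant at the .05 level but not at the .01 level.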

The Consistency indices, Table 1 
reveals, are not at all related to Wisdom 
as measured by the index of Agreement 
between Interests and Preferences, 
and generally unrelated to the other 
Wisdom indices. Consistency of Field 
and Consistency of Family (based 
partly on Field) are related, to a low 
and just barely significant degree, to 
Agreement of Aptitudes and Preference,
and Consistency of Family is similarly 
related to Accessibility of Preference. 
The correlation of .26 between Con- 
sistency of Field and Agreement be- 
tween Occupational Level of Measured 
Interests and Level of Preference is 
presumably an artifact, as each of 
these indices is in part based on other 
measures which deal largely with 
level: Field involves type of job, and 
some types of jobs are largely at lower 
levels (e.g., Business Contact, Outdoor) 
whereas other types are largely at 
higher levels (e.g., General Cultural, 
Science), and Level of Interests or 
Preferences is by definition and 
operationally a measure of socioeco- 
nomic level. The relationships between 
Consistency and Wisdom shown by 
Table 1 may therefore be viewed as 
negligible. 

The internal agreement among 
the indices of Wisdom of Vocational 
Preference must next be considered. 
Agreement between Aptitude and 
Preference is unrelated to Agreement 
between Measured Interest and Pref- 
erence, or to Agreement between 
Level of Measured Interests and Level 
of Preference; its correlation of .27 
with Accessibility of Preference should 
probably be viewed as an artifact in 
view of the facts that the nonpreference 
variables in the two Agreement meas- 
ures (intelligence and socioeconomic 
status) correlate .27 in this sample, 
and the preference variables (socio- 
economic level of preference and 
intelligence level of preference) in the 
two measures presumably have an 
even higher intercorrelation because 
in each case one variable (preference) 
is scaled on one of two other highly 
intercorrelated variables (socioeco- 
nomic status and intelligence). 

Agreement between Measured In- 
terests and Preference is unrelated to 
the other Wisdom indices, two of the
intercorrelations being nonsignificant
and the third (-.32) being in the opposite
direction from that hypothesized. And,
finally, Agreement between Level of 
Measured Interests and Level of 
Preference is also to be viewed as 
unrelated to the other Wisdom in- 
dices, as the barely significant and 
very low correlation of .17 with Ac- 
cessibility can be attributed to an 
artifact: both components of both 
agreement measures are scaled as to 
socioeconomic level, and the socio- 
economic level of the vocational 
preference is one of the two measures
entering into each of the indices. 

The only conclusion that it seems 
legitimate to draw from Table 1 is, 
then, that the few seemingly significant 
intercorrelations may be the products 
of artifacts, and that the various indices 
of consistency and wisdom of voca- 
tional preferences are unrelated to 
each other. This lack of relationships 
suggests a lack of validity as indices 
of anything significant in the vocational 
development of ninth grade boys. 


Agreement between Consistency and 
Wisdom of Preferences and Other 
Variables 

The construct validity of a set of 
measures of the same variable de- 
pends not only upon the intercorre- 
lations of these presumably similar 
measures, but also upon their agree- 
ment with other variables to which 
theory would lead one to expect them 
to be related. If consistency and wis- 
dom of vocational preferences are 
considered to be indices of vocational 
maturity, of the degree of vocational 
development which has taken place in 
an adolescent, then they should be 
related to measures of other character- 
istics which might be expected to 
develop concomitantly with, to result 
in, or to be the product of, vocational 
maturity. 
TABLE 2


RELATIONSHIPS OF CONSISTENCY AND WISDOM OF NINTH GRADE VOCATIONAL
PREFERENCES TO STATUS AND BACKGROUND VARIABLES


Age in Grade 9    Socioecon. Level    Intell. in Gr. 9    School Achiev.    Pattern Interests


Consistency: 
Within Fields 
Within Levels 
Within Families 
Wisdom in terms of: 
Aptitudes 
Measured Interests 
Measured Interest Level 
Accessibility 


-04
-03
-02


-08
-12


-07
02
-11


-16
-12
14
12


42**


14
08
18


Note.—N = 105 boys. Decimal points have been omitted. 


* Significant at the .05 level, one-tailed test. 
** Significant at the .01 level, one-tailed test. 


The Career Pattern Study therefore 
used or developed a series of measures 
of other variables which could be hy- 
pothesized as related to vocational 
maturity. These were age (except that 
the negative correlation
between age and intelligence in any 
one grade—in which the older pupils 
tend to be the retarded and the younger 
tend to be the accelerated—and the 
limited age range may be expected 
to confuse the relationship), socioeco- 
nomic level, intelligence, school achieve- 
ment, patterning of vocational inter- 
ests, emotional adjustment, and peer 
acceptance. 

Socioeconomic level was measured 
by the placement of the family bread- 
winner’s occupation on the Hamburger 
revision of the Warner scale, intelligence 
by the Otis Quick-Scoring Test of 
Mental Ability, school achievement 
by grades in the three constant courses 
(taken by all students) of the ninth 
grade, patterning of interests by the 
application of Darley’s (1941) method 
to scores on Strong’s Vocational In- 
terest Blank, adjustment by Over- 
street’s method of deriving a total 
adjustment score from stories told in 
the Thematic Apperception Test (Super
& Overstreet, 1960), and peer
acceptance by a modification of the
Guess Who technique (Super &
Overstreet, 1960).

The correlations of the Consistency 
and Wisdom indices with these vari- 
ables are reported in Table 2. Only 2 
of the 49 correlations are statistically 
significant and in the hypothesized 
direction, and one of these relation- 
ships is due to an artifact: Accessibil- 
ity is correlated .42 with Socioeco- 
nomic Level, but the latter constitutes 
part of the former. The one presumably 
true relationship is that of Agreement
between Measured Interests and Pref- 
erences with Patterning of Interests 
(.29): boys whose preferences agree 
with their measured interests tend to 
have clear-cut patterns of measured 
interests, a relationship which makes 
excellent psychological sense and which 
gives one a little confidence in meas- 
ured interests as having some mean- 
ing at the ninth grade level, despite 
their lack of relationship to other vari- 
ables. (But when interests are related 
only to interests one wonders if there 
is perhaps a measurement artifact at 
work here also.) 
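
The significance levels cited for Table 2 can be checked approximately. The sketch below is an illustration only, not the procedure used in the Career Pattern Study: it assumes Fisher's z transformation as a large-sample approximation for testing a Pearson correlation against zero, with N = 105 taken from the table note. The function name is hypothetical.

```python
import math


def one_tailed_p_for_r(r: float, n: int) -> float:
    """Approximate one-tailed p value for a Pearson r (H0: rho = 0).

    Uses Fisher's z transformation: atanh(r) * sqrt(n - 3) is treated
    as a standard normal deviate. This is an approximation, assumed
    here for illustration.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r) * math.sqrt(n - 3)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # .42 and .29 are the correlations discussed in the text;
    # .14 is one of the unstarred entries in Table 2.
    for r in (0.42, 0.29, 0.14):
        p = one_tailed_p_for_r(r, 105)
        print(f"r = {r:.2f}, N = 105: one-tailed p = {p:.4f}")
```

Under these assumptions, correlations of .42 and .29 fall beyond the .01 level with 105 boys, whereas a correlation of .14 does not reach the .05 level, which agrees with the asterisks in the table.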

Five other correlations in Table 2 
are large enough to be significant, 
but since they are in the unexpected 
direction and do not make psychological
sense they must be attributed to
chance. In a table of 49 correlations, 
one might find 2 or 3 correlations 
which seem statistically significant at 
the .05 level, and 1 correlation sig- 
nificant at the .01 level, strictly on a 
chance basis. 
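
The chance expectation stated here is simple arithmetic and can be sketched as follows. The example assumes, purely for illustration, that the 49 significance tests are independent; they are not strictly so, since the same boys and overlapping measures enter every correlation. The function names are hypothetical.

```python
from math import comb


def expected_chance_hits(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results when every null is true."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """Binomial probability of k or more chance hits among n_tests tests.

    Assumes independent tests, which is an idealization.
    """
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        expected = expected_chance_hits(49, alpha)
        at_least_one = prob_at_least(1, 49, alpha)
        print(
            f"alpha = {alpha}: expected chance hits = {expected:.2f}, "
            f"P(at least one) = {at_least_one:.3f}"
        )
```

The expected counts are 49 × .05 = 2.45 and 49 × .01 = 0.49, roughly in line with the "2 or 3" and "1" cited in the text.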

The conclusion to be drawn from 
Table 2, as from Table 1, thus appears 
to be that the consistency and wisdom 
of vocational preferences have little 
significance for prevocational and 
vocational choices at the ninth grade 
level. 


Implications 

If the consistency and wisdom or 
realism of a ninth grade boy’s ex- 
pressed vocational preferences lack 
validity, as the Career Pattern Study 
data suggest, does this mean that they 
should be disregarded in practice? The 
fact that consistency and wisdom of 
vocational preferences have little 
meaning at Grade 9 does suggest that
they should not be used as criteria of
the need for guidance nor of the
effectiveness of guidance at that
stage of development. Presumably at 
this stage the very instability and 
transiency of expressed vocational 
preferences, and perhaps also their 
inadequate factual basis, make their 
apparent consistency and wisdom 
largely a matter of chance. However, 
one would be reluctant to draw the 
conclusion that the preferences them- 
selves, even if inconsistent or unwise, 
should be disregarded, for this would 
imply not only the assumption that
the counselor knows better than the 
pupil what is appropriate for him, 
but also the assumption that the best 
way to help the pupil to choose and 
plan wisely is to get him to con- 
centrate on data coming from without 
himself rather than to examine his 
self-concept in relation to external
and impersonal data. The demonstrated,
even though far from perfect,
validity of aptitude, achievement,
and interest test data makes it clear
that the counselor does indeed have 
unique externally derived information 
as to what is appropriate for the pupil. 
But it may well be, as many counselors 
believe and as some research shows, 
that the best way to let this informa- 
tion help the pupil is to aid him in 
assimilating it into his concept of 
himself. This is best done by beginning, 
not with the data, but with the self- 
concept. And the statement of a vo- 
cational preference is one way of 
expressing a self-concept, as Bordin 
(1943) and the present writer (1951) 
have pointed out. 

There are, of course, various ways 
of helping in the exploration of the 
self-concept as manifested in an ex- 
pressed vocational preference. One is 
to help the pupil to choose courses, 
activities, part-time employment, etc. 
in which he may find opportunities to 
see if the preferred role does indeed 
suit him, and to discuss his use and 
evaluation of these experiences with 
him as he participates in them. Another 
is to help him to examine his prefer- 
ence in the interview, relating it to 
his picture of himself as a student, as a 
part-time worker, as a member of a 
family group, etc. In this discussion 
he may be helped to consider the views 
of him held by other persons and the 
picture of himself obtainable from 
records of his performance. This 
discussion can lead to the desirability 
of obtaining other such external data, 
and more testing or more participating 
in courses or activities may be planned 
and the results later reviewed by pupil
and counselor. By thus using expressed
vocational preferences as springboards
for exploration and growth, for
self-evaluation and further planning, 
the counselor proceeds democratically 
and with respect for his client’s right
to and need for self-determination, 
while avoiding the error of placing 
undue emphasis on preferences as a 
basis for directional choice. 


SUMMARY 


Questions are raised concerning the 
significance of the consistency and 
wisdom or realism of vocational 
preferences among ninth grade boys, 
and the use of measures of these as 
criteria of the effectiveness of or the 
need for vocational guidance. The 
development and application of several 
such measures in the Career Pattern 
Study are described, and data on the 
construct validity of these indices for 
105 typical boys are reported. The 
failure to find significant relationships 
in the hypothesized directions is taken 
as evidence of the lack of psychological 
and hence of practical educational 
significance of consistency and wisdom 
or realism of vocational preferences at 
this stage of development. It is con- 
cluded that, although they may be 
meaningful at later stages, they should 
not be used as criteria of the need for 
or effectiveness of guidance and coun- 
seling at the ninth grade level. 


THE COMPARATIVE INFLUENCE OF PUNITIVE AND
NONPUNITIVE TEACHERS UPON CHILDREN’S


This paper reports a portion of a 
research project pertaining to the management
of children’s behavior in
classroom settings. Because so many 
teachers, especially beginners, verbal- 
ize considerable concern about disci- 
pline and control, we are focusing our 
current research in this area. While 
there is some relevant literature, such 
as that of Sheviakov and Redl (1944),
based upon experience and insightful- 
ness, we have been unable to locate 
any generalizations based upon data 
from research. 

In a previous study by Kounin and 
Gump (1958) specimen-record types of 
observations were gathered of disci- 
pline incidents during the first week 
of kindergarten, focusing upon the 
triad of: a misbehaving child (target), 
a teacher doing something to stop the 
misbehavior, and a watching audience- 
child. Limiting our dependent variables 
to overt behavior we found that teach- 
ers’ techniques of handling a misbe- 
having kindergarten child (target) did 
have different degrees of socializing 
success upon audience-children. A so- 
cializing success was defined as an 
observable reduction of overt misbehavior
or an increase in conforming behavior
(standing up “even straighter”
in line). Control techniques high in 
clarity (defining the deviancy, specifying
how to stop) were most successful.

1 A version of this paper was presented
at the American Psychological Association
meeting, September 5, 1959. It is part of an
investigation supported in part by a
Research Grant M-1066, from the National
Institute of Mental Health, United States
Public Health Service and in part by a
grant from the College of Education of
Wayne State University.

Control techniques high in firmness 
(standing closer to the misbehaving 
child, continuing to look at him until 
he stopped misbehaving) were success- 
ful only for audience-children who were 
themselves deviancy-oriented at the 
time. Control techniques high in rough- 
ness (anger, physical handling) were 
least successful and tended to be fol- 
lowed by behavior disruption (less 
involvement in work, overt signs of 
anxiety) rather than conformity on 
the part of audience-children. In terms 
of their effects, it is evident that rough- 
ness is a different dimension than firm- 
ness. 

Since attitudes toward misconduct 
may also be affected by differences in 
control techniques we decided to study 
these as well. In an unpublished study 
of children at camp, P. Gump, B. 
Biddle, and J. Kounin found significant 
differences in attitudes toward camp 
misconduct held by campers who had 
effective counsellors as compared to 
campers who had ineffective counsel- 
lors. The counsellors, however, varied 
along many dimensions including puni- 
tiveness, goal-directedness, physical 
and psychological absenteeism, and 
others. The campers’ attitudes toward 
misconduct also varied according to 
whether they were talking about camp, 
home, or school milieus. We decided, 
therefore, to limit the leadership di- 
mension to punitiveness and the milieu 
to school. 

It is postulated that aggression leads
to counteraggression; it is further
postulated that a punitive teacher has more
power over her pupils than they have
over her and that she blocks overt
manifestation of pupils’ aggression
(observations in the classrooms of the
punitive teachers selected for this 
study indicate that this second assump- 
tion is tenable). From these two postu- 
lations, we derive the following hy- 
potheses: 

1. That the school misconduct pre- 
occupations of children with punitive 
teachers will contain more aggression 
than those of children with nonpuni- 
tive teachers. 

2. That children with punitive 
teachers will be more conflicted about 
school misconduct than will children 
with nonpunitive teachers. 

3. That the aggression needs and
the conflict relating to misconduct hy- 
pothesized to exist among children 
with punitive teachers will detract 
from their concern with school-unique 
values that are not directly related to 
misconduct. 

4. The question may also be raised 
as to whether or not the amount of 
tension generated in the children with 
these particular punitive teachers is 
sufficiently great to reduce the rational 
qualities of their attitudes toward mis- 
conduct. 

METHOD 

Subjects. The subjects were 74 boys and 
100 girls attending their first semester of 
the first grade in the public schools of a 
large city. They represented all the children 
from six home rooms of three schools, in
upper-lower to middle-middle
socioeconomic neighborhoods.

Procedure. Overall school climate was 
controlled by selecting pairs of punitive 
vs. nonpunitive teachers from the same 
school. Three such pairs were obtained from 
three elementary schools. 

The initial selection of punitive and non- 
punitive teachers was obtained from the 
principal and assistant principal. Following 
this the classes were observed by both 
principal investigators. Approximately a
week later the teachers were further rated 
by a supervisor of student teachers who 
visited each class twice. 


The raters checked along a continuum 
from Extremely Punitive (threatens chil- 
dren with consequences that really hurt; 
makes threats that imply sharp dislike, real 
willingness to harm child; ever-readiness to 
punish) to Not Punitive (does not punish 
and does not threaten). A punitive vs. non- 
punitive pair of teachers was used for the 
study only when all five persons agreed on 
their dichotomizations. All the teachers 
were rated as having good organization, 
well-behaved classes, and as achieving the 
learning objectives for their grade. Eighty- 
four of the children were in classes with 
punitive teachers and 90 children were in 
classes with nonpunitive teachers. 

The children were interviewed individ- 
ually during the third month of attendance 
at school. The interview consisted of the 
questions, “What is the worst thing a child
can do at school?” and, following the reply,
“Why is that so bad?” Identical questions
were asked regarding home as the milieu 
for misbehavior. 

Coding the Replies. The misconducts 
mentioned by the children were coded for 
content and for certain qualities or dimen- 
sions. 

The content code (obtained from the 
question of “What is the worst thing to
do?”) contained two parts: the misconducts
and the explanations given for why these 
were bad. The misconduct included: the 
act type (physical or psychological assaults, 
noncompliance, etc.) and the object of the 
misconduct (parents or teachers, other chil- 
dren, institutional laws or custom, etc.). 

The code for the explanation of miscon- 
duct was designed to answer three ques- 
tions: Who is involved in the consequence 
(the child himself, parent or teacher, a peer, 
etc.)? What kind of sufferings result to 
others from the misconduct (physical pain, 
achievement loss, property loss, etc.)? What 
kinds of retributions occur to the misbe- 
haver (work imposal, character loss, physi- 
cal punishment, etc.)? 


RESULTS AND CONCLUSIONS 


Children probably answer the ques- 
tion of “What’s the worst thing a 
child can do in school?” with a report 
of acts that reflect their preoccupa- 
tions. It is not likely that our subjects’ 
answers would have been the same if 
they were presented with a forced- 
choice of alternative acts. Given a 
choice, most children would probably
rate “stabbing someone” as more serious
than “talking in class.” If the
misconducts the children talked about 
are taken to represent tension systems 
and preoccupations, we may infer 
from these the comparative impact of 
punitive and nonpunitive teachers. 

In a concurrent study of children’s
attitudes toward misconduct (201 boys 
and 214 girls in the first grade of six 
public schools representing a range of 
socioeconomic backgrounds), Gump 
and Kounin (1959) found both sex 
differences and differences between 
home and school milieus. For example, 
home misconducts included more 
breaking of objects while school mis- 
conducts included more rule violations; 
parents suffered more than teachers 
in consequences but teachers retributed 
more frequently. However, parents 
were reported as retributing with more 
corporal punishment and with more 
severe punishment than teachers.
There were also differences in the 
responses of boys and girls, especially 
in school. For example, girls reported 
“talking” as a school misconduct eight 
times more frequently than boys, 
whereas boys reported physical as- 
saults on peers in school more fre- 
quently than did girls. 

Consequently, the comparison of 
the responses of children with punitive 
and nonpunitive teachers was made 
separately for sexes and also for home 
and school milieus. However, on all 
comparisons of school responses the 
direction of differences between chil- 
dren with punitive and nonpunitive 
teachers was the same for both boys 
and girls. The report of results, there- 
fore, combines both boys and girls. 
Insofar as the differences between chil- 
dren with punitive and nonpunitive 
teachers are concerned, only 2 of the 
48 comparisons of home responses were 
statistically significant: home misconducts
of the children with punitive
teachers were rated as more serious 
(p < .05, for girls only) and retribu- 
tions to the subject were more serious 
(p < .02, for boys only). It is uncertain 
whether these represent some spillover 
of the influence of punitive teachers 
onto attitudes toward home miscon- 
ducts, or whether they are chance dif- 
ferences for the number of comparisons 
made. 

The results to be reported here, then, 
refer to boys and girls combined and 
to school misconducts only. Intercoder 
reliabilities ranged from 73-95% 
agreement, with a median of 90. The 
p levels of differences are based on
the χ² test. In the case of dimensions,
such as “seriousness,” the results
were dichotomized into a High and a
Low based upon as equal a break as
was possible and resulting χ²’s were
based upon 2 × 2 tables. In the case
of categories such as act-types falling
into the categories of: rule violations,
physical assaults on children, property
damages, or nonconformance with
adults, the χ² was computed for as
many cells as there were categories.
At times, when one particular category
was of interest, a 2 × 2 table was constructed
with that category versus “all
the rest,” providing that the overall
table showed statistical significance.
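
The dichotomize-then-test procedure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' computation: the break is taken at the median, the χ² is the ordinary Pearson statistic for a 2 × 2 table without a continuity correction (the paper does not say whether one was applied), and the cell counts in the example are invented. The function names are hypothetical.

```python
import math
from statistics import median


def dichotomize(scores):
    """Label each score High or Low, breaking at the median.

    Scores above the median are High; ties with the median go to Low,
    which gives as equal a break as the data allow.
    """
    cut = median(scores)
    return ["High" if s > cut else "Low" for s in scores]


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 df, no continuity correction) for the table

                 High  Low
        group 1    a    b
        group 2    c    d

    Returns (chi2, p).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


if __name__ == "__main__":
    # Invented counts for illustration only: 84 children with punitive
    # teachers (50 High, 34 Low) and 90 with nonpunitive teachers
    # (30 High, 60 Low) on some dichotomized dimension.
    chi2, p = chi_square_2x2(50, 34, 30, 60)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

For a category table with more than two act-types, the same logic extends to one cell per category; the 2 × 2 form shown here corresponds to the "that category versus all the rest" comparison.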

Following are definitions of the codes 
used which are not self-explanatory 
both for the misconducts and for the 
subjects’ explanations for the wrong- 
ness of the act. These codes appear in 
Table 1. 


I. 

A. Physical assaults include all physical 
attacks on other persons (pushing, hitting). 

B. Milieu-seriousness refers to the length 
to which the milieu would go to prevent such 
an act. The school would practically ignore 
“scratching head,” would mildly frown at
“whispering,” and would go to any length
to stop burning down buildings. 

C. Coder seriousness refers to the general 
immorality or danger in the misconduct
considered from the point of view of the 
overall culture. (The only frame of reference 
which produced high intercoder agreement 
was when the coders took the position of an 
understanding Reformed rabbi or a
Unitarian minister. Taking an unspecified role,
or that of either a parent or a teacher
produced low intercoder agreement.)
“Studying spelling lessons at the wrong time” is
morally trivial while “maiming someone”
is morally very serious.

D. Abstractness of misconduct refers to 
the size of coverage. It may range from a 
unique, “one time” misbehavior, such as,
“cut a climbing rope in gym,” to an
abstract one, such as, “be mean to other
people.”


II. 

A. A central adult is the responsible 
leader: teacher at school. 

B. A psychological loss to another is
exemplified by “It would make her worry.”

C. Seriousness of consequences to others
ranges from trivial harm, such as, “She’d
be annoyed” to serious ones, such as, “He’d
die.”

D. A reality-centered retribution (this is 
scored only when the perpetrator himself 
suffers in the consequence) is coded when 
the consequence of a misconduct follows 
naturally from the act-type, such as: “not
study because you’ll get behind in your
work.” This contrasts with the response in
which the connection between act and
consequence is dependent upon a personal
intervention of another, such as: “not study
because teacher will make you stand in the
corner.”

E. “Reflexive justification” was coded
when the child gave no consequence for
either himself or others in his explanation
of why the act was bad. When he said the
act is bad because “It’s not nice” or “It’s
bad” it was called a reflexive justification.


III.

A. On ego-acceptability, we sought to
determine the degree to which the respondent
could see himself as the perpetrator of
the misconduct. In an ego-alien act, the
respondent expresses abhorrence, such as:
“It’s dirty to hit little kids who didn’t do
nothing to you.” An ego-attractive act is
one in which the child indicates its seductive
quality for him, such as: “Tell off a teacher
—boy, I'd like to do that.” 

B. On the premeditation category, we
sought to learn the extent to which the
child sought to do wrong. If premeditated,
the child plans the act and intends the
consequences ahead of time, such as: “Put
thumb tacks on teacher’s chair when she is
out.” If intentional, the child accepts his
part in the wrongdoing but does not plan it,
such as, “talk during a lesson.”


TABLE 1
A COMPARISON OF ATTITUDES TOWARD
SCHOOL MISCONDUCTS HELD BY CHILDREN
WITH PUNITIVE AND NONPUNITIVE
FIRST GRADE TEACHERS
(N = 176)

                                                  Pu* | NPu

I. Content and quality of the misconducts:
   A. Physical assaults on others | 17
   B. Milieu-serious misconducts | 89 | 63
   C. Coder-serious misconducts | 48 | 27
   D. Abstract misconducts | 27 | 52
II. Content and quality of the explanations:
   A. Peers as objects of consequences | 94 | 61
   B. Physical damage to objects of consequences | 60 | 23
   C. Serious harm to others | 45 | 18
   D. Reality-centered retributions | 21 | 48
   E. “Reflexive justifications” as explanations | 11 | 26
III. Role of self in misconducts:
   A. Ego-alien misconducts | 26 | 11
   B. Premeditated misconducts | 29
IV. Aggression:
   A. Overall aggression (“blood and guts”) | 49 | 24
V. Concern with school-unique objectives:
   A. Learning and achievement losses | 20 | 43
   B. Institutional law violations | 49 | 62

Note. All differences in percentages are significant
at the .05 level or beyond.

* Pu stands for those children who have punitive
teachers; NPu refers to those children who have
nonpunitive teachers.


IV. 

Aggression (“‘blood and guts’’) refers to 
the amount of aggression the respondent 
expresses in his misconducts and
consequences. “Play in the storage bin because
somebody might get hurt” expresses less
aggression than “Play in the storage bin
because you might push a kid off and there
could be a sharp rock down there and he
could hit his head against it and crash open
his skull and he would bleed and his brains
would fall out and he’d die.”


V. 

A. A learning or achievement loss is coded 
when interference with learning is the
misconduct or the explanation, such as: “It’s
bad to make noise because somebody could
make a mistake in his work,” or “. . . because
then he couldn’t read good.”

B. An institutional law violation is a 
violation of the rules of the school such as: 
“talk when you’re supposed to study,”
“not take your seat when the bell rings.”


The results presented in Table 1 
may be summarized around the three 
hypotheses and the one question raised 
in the introduction: 

Punitive teachers will create or activate
more aggression-tension than will
nonpunitive teachers. This is strongly
supported by the data. The children who
have punitive teachers have more 
sheer aggression in their sins and con- 
sequences, they give both more milieu- 
serious and more coder-serious mis- 
conducts, their targets suffer more 
harm, they give more physical assaults 
as act-types, and their targets suffer 
more physical harm. The targets of 
children with nonpunitive teachers are 
more inclined to suffer psychological 
losses as consequences. As an example 
of the results: of 84 respondents with
punitive teachers, 31 give physical
assaults on other children and 40
mention school rule violations; while of 90
children with nonpunitive teachers,
[illegible] talk about physical assaults and
[illegible] about rule violations. (The remainder
of the act-types are nonconformances
and “miscellaneous.”)

Children with punitive teachers will 
be more unsettled and conflicted about 
misbehavior in school. This hypothesis 
is supported by the findings related to
the role of self in misbehavior. The
children from nonpunitive teachers
give misconducts in which their own
role is intentional whereas children
from punitive teachers give both
premeditated and ego-alien misconducts.
We may say that children with puni- 
tive teachers express more abhorrence 
for the misdeeds which they have se- 
lected and yet select misdeeds which 
require “malice and forethought.” 

Punitiveness of teachers will detract
from children’s concern with
school-unique values. This hypothesis is
supported. Children from punitive
teachers talk more about physical attacks
on peers—misbehavior by no means 
unique to the classroom setting. Chil- 
dren with nonpunitive teachers talk 
more about learning, achievement 
losses, and violations of school-unique 
values and rules. 

Do children from nonpunitive teachers
show more rational qualities in their 
responses? The answer to this question 
is not clear. Fairly direct attempts to 
measure this—codes for milieu likeli- 
hood of misconducts, for likelihood of 
consequences, and for appropriateness 
of consequences to the misconduct— 
did not show significant differences 
between the two groups. On the other 
hand, children with punitive teachers 
gave fewer abstract misconducts which 
result, in our camp study, was nega- 
tively correlated with age. But these 
same children also gave fewer reflexive 
justifications which result was posi- 
tively correlated with chronological 
age. One interpretation of the findings 
that children of punitive teachers gave 
fewer abstract misconducts and fewer 
reflexive justifications is to regard these
as indications of the unsettled and
conflicted state of the attitudes regarding
misconduct held by children with
punitive teachers. When a child is in- 
clined to misbehave but fears to, then 
a concrete act occurs to him: “hit
George in the mouth”; when he is not
pressed by his needs to misbehave,
then an abstraction occurs to him:
“be mean to people.” Similarly, when
he expresses this verbal act, a real
consequence occurs: the target gets
hurt or the perpetrator suffers a
consequence; when he is not preoccupied
with wrongdoing then a reflexive
justification occurs to him: “it’s not nice.”
A reflexive justification at this age 
may not be a primitive reply but a 
reflection of a settled issue: ‘‘You just 
don’t do this because it’s not nice.” 

Another interpretation is to regard
the greater use of reflexive justification 
by the children with nonpunitive 
teachers as evidence of their greater 
trust and faith in school, i.e., of their 
internalization of school values more 
than children with punitive teachers. 
Inspection of the data showed the re- 
flexive justification was used predom- 
inately in connection with rule vio- 
lations (talking, running in halls, 
not taking seat, and the like). These 
misconducts are milieu-inconvenient:
they are disturbing to the milieu but
are without direct harm to
either the actor or to others and do not 
violate an important moral code. Such 
misconducts to the first grade child 
have no real explanation except that 
“they’re bad because they say so.” 
As such, they express a sort of naive 
faith and trust in the rightness of 
what the teacher says. 


SUMMARY


Three pairs of punitive vs. nonpuni- 
tive first grade teachers were selected 
from three elementary schools. The 
174 children in these teachers’ class- 
rooms were individually interviewed 
about what they thought was “the 
worst thing to do in school” together 
with their explanations of why these 
misconducts were bad. Regarding their 
responses as expressions of their pre- 
occupations it was concluded that, as 
compared with children who have non- 
punitive teachers, children who have 
punitive teachers: manifest more ag- 
gression in their misconducts, are more 
unsettled and conflicted about mis- 
conduct in school, are less concerned 
with learning and school-unique values, 
show some, but not consistent, indica- 
tion of a reduction in rationality per- 
taining to school misconduct. A theory 
that children with punitive teachers 
develop less trust of school than do 
children with nonpunitive teachers 
was also presented to explain some of 
the findings. 


A major criticism of research concerned
with developmental changes in
the structure of intelligence has been 
the lack, at each age level, of com- 
parable test batteries and subjects 
(Anastasi, 1948). It can be argued that 
inappropriate test content at one age 
level or another serves to lower relia- 
bility which in itself could account for 
age changes. Differences in variability 
of scores can similarly affect the usual 
statistical procedures employed in 
these kinds of studies. Accordingly, 
longitudinal data would seem to be 
one means of reducing the effects of 
these variables. The writers have for- 
tunately been able to employ a sample 
of 100 subjects who were examined on
two occasions separated by a time 
interval of approximately 3.5 years 
with a test standardized for both de- 
velopmental levels. The purpose of 
this paper is to present several analy- 
ses of the data appropriate to each of 
the following issues: developmental 
changes in the magnitude of g, con- 
sistency of relative position and con- 
sistency of profiles, consistency of 
factorial structure (factor validity), de- 
velopmental changes in magnitude of 
sex differences and rates of growth on 
each of the primary abilities, and ac- 
curacy of long range predictions of 
achievement. 


The Primary Mental Abilities Test 
(PMA), Intermediate Form (age 11-17), was 
administered to all of the eighth graders 
present in a junior high school located in 
a small industrial community.1 Approximately
3.5 years later, during the last week
of Grade 11, the same form of the test was 
readministered, at which time only 8% of 
the original sample were, for a variety of 
reasons, not in attendance. As shown in 
Table 1, the final sample of 100 youngsters, 
49 boys and 51 girls, scored somewhat above 
average on the test (M_IQ = 107.95) with the
variability comparing favorably with the
standardization group (σ = 16.56). As a
safeguard against selective factors influenc- 
ing the data, only those subjects available 
at both testing sessions were included in the 
final analysis. An important feature of this 
sample is the fact that the subjects are 
homogeneous with respect to socioeconomic 
background (lower middle class) and educa- 
tion. Most of the boys were in the general 
education program whereas most of the 
girls were enrolled in a commercial program. 

The PMA yields scores on the five pre- 
sumably independent traits which Thur- 
stone (1938) has defined as constituting in- 
telligence. These five traits include Verbal 
Meaning (V), Space (S), Reasoning (R), 
Numerical (N), and Word Fluency (W). In 
addition a Total score (T) is available. The 
reliabilities of each of the subtests as well 
as the Total score are quite satisfactory 
(Thurstone, 1958). It would have been help- 
ful to have available reliabilities by age or 
grade levels, but unfortunately such data 
are apparently unavailable in published 
form. Since the test was standardized for an 
age group inclusive of the present sample, 
the assumption is made that reliabilities 
are essentially comparable over the age 
range inclusive of Grades 8 and 11. In addi- 
tion to the foregoing psychometric features, 
this scale was employed because the types 
of analysis planned had not been previously 
reported in the literature for the PMA. 

On the day following the administration 
of the PMA, the Myers-Ruch High School 
Achievement Test (MRT) was adminis- 
tered. Eight students were absent for this 
test so that analyses including the MRT are 


Duquesne, Pennsylvania, and to H. Me- 
Keegan, guidance counselor at the high 
of this study.
TABLE 1
MEANS AND STANDARD DEVIATIONS OF
CHRONOLOGICAL AGE AND IQ AT GRADE
8 AND GRADE 11


              Grade 8                    Grade 11
         B        G        T        B        G        T
N        49       51       100      49       51       100
M_CA     12.84    12.72    12.78    16.98    16.17    16.98
σ_CA     .43      .57      .62      .42      .55
M_IQ     107.69   108.20   107.95   106.24   111.65   108.51
σ_IQ     17.78    14.41    16.56    15.81    14.30    15.05


based on an N of 92, 46 boys and 46 girls. 
The MRT was selected because of the short 
testing time (one hour) and the fact that it 
samples a variety of subject matter areas. 
It would probably have been more desirable 
to use one of the achievement tests yielding 
a variety of specific scores rather than an 
overall score, as does the MRT, but the 
time necessary to administer such tests was 
not available. 


RESULTS 


Developmental Changes in the Magni- 
tude of g 

Garrett (1946) in his well known 
developmental theory of intelligence 
suggests that the relative dominance 
of Spearman’s g decreases with age, 
resulting in the emergence of specific 
factors. The inconsistent findings re- 
ported in the literature would seem to 
be the result of the methodological 
problems inherent in the cross-sec- 
tional approach (Anastasi, 1948). A 
study by Asch (1936), who retested in 
Grade 7 the subjects originally tested 
by Schiller (1934) in the third grade, 
deserves closer attention because of 
the longitudinal approach employed. 
Consistent with Garrett’s position, 
lower correlations between the verbal 
and numerical tests were found after 
the 4 years. It should be noted, how- 
ever, that only 161 of the original 
sample of 395 were available for the 
second testing. In addition to the effects
of losing 59% of his original
sample, the degree to which the tests 
were appropriate for the older subjects 
was questionable. 

The present data were analyzed in 
a manner consistent with earlier stud- 
ies so as to provide a basis for compari- 
son. Several investigators (Asch, 1936; 
Garrett, Bryan, & Perl, 1935) have 
reported median r’s derived from inter- 
correlations among the subtests at 
each age level. A similar analysis pre- 
sented in Table 2 for the total group 
(N = 100) reveals a slight decrease 
(.02) in the magnitude of the inter- 
correlations. In a separate analysis by 
sex, also shown in Table 2, the median 
intercorrelations for boys were .31 and 
.28 and for the girls they were .32 and 
.27, indicating little or no sex differ- 
ence. A second analysis, involving the 
determination of the proportion of 
variance accounted for in the first 
unrotated centroid factor, supports the 
foregoing conclusions (the percentages 
being 39% and 37%, respectively). 
Since both analyses are assumed to 
be related to the magnitude of g, it 
may be concluded that within the age 
range of the sample there is little or 
no evidence for increased differentia- 
tion of abilities. It should be noted, 
however, that several of the intercor- 
relations are moderately high, particu- 
larly those involving the V subtest, 
suggesting that the g factor is present. 
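
The two indices used in this paragraph can be sketched as follows. This is only an illustration: the correlation matrix is hypothetical (it is not the study's Table 2), and the communality estimate placed in the diagonal (the largest off-diagonal value in each column) is an assumption, since the text does not say which estimate was used.

```python
from itertools import combinations
from math import sqrt
from statistics import median

# Hypothetical intercorrelations for V, S, R, N, W (NOT the study's data).
R = [
    [1.00, 0.30, 0.55, 0.40, 0.45],
    [0.30, 1.00, 0.35, 0.20, 0.10],
    [0.55, 0.35, 1.00, 0.45, 0.30],
    [0.40, 0.20, 0.45, 1.00, 0.25],
    [0.45, 0.10, 0.30, 0.25, 1.00],
]


def median_intercorrelation(r):
    """Median of the off-diagonal correlations, each pair counted once."""
    return median(r[i][j] for i, j in combinations(range(len(r)), 2))


def first_centroid_variance(r):
    """Proportion of total variance on the first unrotated centroid factor.

    Assumption: each diagonal cell is replaced by the largest absolute
    off-diagonal value in its column before the column sums are taken.
    All correlations are assumed positive, so no sign reflection is done.
    """
    n = len(r)
    work = [row[:] for row in r]
    for j in range(n):
        work[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    total = sum(sum(row) for row in work)
    # Centroid loading of a test = its column sum / sqrt(grand total).
    loadings = [sum(work[i][j] for i in range(n)) / sqrt(total) for j in range(n)]
    return sum(a * a for a in loadings) / n


median_r = median_intercorrelation(R)
first_factor_share = first_centroid_variance(R)
```

Both quantities fall when the subtests become less intercorrelated, which is why either can serve as a rough index of the magnitude of g.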

Previous research (Meyer, 1960; 
Thurstone, 1938; Thurstone & Thur- 
stone, 1941; Tyler, 1958) along with 
the present data include a develop- 
mental span ranging from Grade 1 
through senior high school and up to 
the age of 25 permitting tentative 
conclusions concerning age-grade 
changes in performance on the PMA. 
Tyler (1958) reports a longitudinal 
study in which the performance of the 
youngsters in the first grade is com- 
TABLE 2


INTERCORRELATIONS AMONG PRIMARY MENTAL ABILITIES SUBSCALES AT GRADE 8
AND AT GRADE 11


Mean Correlation 


Grade 8 Grade 11 


-06 
15 


31 


18 


Note.—Grade 8, above principal diagonal; Grade 11, below principal diagonal. 


pared with their performance when in 
the fourth grade which in turn is com- 
pared with their eighth grade scores. 
From Grade 1 to Grade 4, it was found 
that total score predicted subsequent 
performance on each subtest about as 
well as prior performance on the specific 
subtest itself. The comparison from 
Grade 4 to Grade 8 revealed that for 
N the correlation between correspond- 
ing subscores was considerably higher 
than the correlation between Grade 4 
T score and Grade 8 N score. Meyer 
(1960), who performed the same analy- 
sis on the PMA from Grade 8 to Grade 
11, found that in addition to N the 
S subtest also emerges. It should be 
noted, however, that all of the corre- 
sponding subtest correlations were 
higher than the correlations between 
total score and each subsequent sub- 
score. Thurstone (1938) working with 
eighth grade children reports a correla- 
tion between the six primary factors 
(which he called a second-order general 


factor), but in another study (Thur- 
stone & Thurstone, 1941) in which 
the age range was from 16 to 25, no 
such second-order factor emerged. 
From these studies, then, the conclu- 
sion may be made that in the primary 
grades (1-4) the PMA abilities meas- 
ure general ability whereas from Grade 
4 to 8 numerical ability emerges from 
the others as a more discrete aptitude. 
After Grade 8, a spatial ability 
emerges, but not until the Age Group 
16 to 25 is there evidence for inde- 
pendent factors. 


Consistency of Relative Position and 
Profile Differences 

An important characteristic of any 
test is the degree to which present 
performance predicts future ratings on 
the particular variable. Certainly many 
things can happen to an individual 
which might effect a significant change 
in his performance relative to his group 
which would serve to minimize the 
importance of prior information. Doppelt
and Bennett (1951) examined the
long range consistency of the Differential
Aptitude Tests (DAT), reporting
rather high correlations over a
period of 3 years with very little difference
between the sexes. Certain tests,
such as Verbal Reasoning and Numerical
Ability, showed more consistency
over time than tests such as
Abstract Reasoning, Space Relations, 
and Clerical Ability, prompting the 
speculation that such differences in 
consistency are related to a uniformity 
of experiences in the former abilities 
whereas the latter, being nonschool 
subjects, are more subject to individ- 
ual interests and experiences. The 
present data are relevant to this hy- 
pothesis. 

Before considering the data analysis 
procedures followed in this section, 
it will be necessary to examine the re- 
liability of the subtests. Obviously if 
the immediate reliabilities of the sub- 
scales are low, consistency measures 
over time would be meaningless. The 
following split-half coefficients have 
been reported in the manual (Thur- 
stone, 1958) for tenth graders: V, .92; 
S, .96; R, .93; N, .89; and W, .72.2
These reliability coefficients are suf- 
ficiently high so that any observed 
changes in relative position over the 
3.5 years could reasonably be ascribed 
to factors other than errors of measurement.
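
The split-half coefficients quoted above are also what a correction for attenuation requires. The sketch below applies the classical formula, on the assumption that the same reliability holds at both testings; the function names are illustrative only. Under that assumption it comes close to the parenthesized total-group entries of Table 3 for V, R, and N (.81, .75, and .73 uncorrected).

```python
from math import sqrt

# Split-half reliabilities quoted from the manual for tenth graders.
RELIABILITY = {"V": 0.92, "S": 0.96, "R": 0.93, "N": 0.89, "W": 0.72}


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Classical correction: observed r divided by sqrt of the reliabilities."""
    return r_xy / sqrt(r_xx * r_yy)


def corrected_retest(subtest, r_retest):
    """Corrected test-retest r for one subtest.

    Assumption: the tenth grade reliability applies at Grade 8 and Grade 11.
    """
    rel = RELIABILITY[subtest]
    return correct_for_attenuation(r_retest, rel, rel)


corrected_v = corrected_retest("V", 0.81)
corrected_r = corrected_retest("R", 0.75)
corrected_n = corrected_retest("N", 0.73)
```

Because the same reliability is used twice, the correction reduces to dividing the retest correlation by the reliability itself.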

A summary of the analysis is pre- 
sented in Table 3. These test-retest 
correlations are based on the same form 
of the PMA and, for the total group, 
are remarkably high except for W. The 
fact that performance on V, R, and N
is more stable over time than on S


2 Reliability for the Word Fluency subtest
was determined by use of the separately- 
timed halves technique originally reported 
by Anastasi and Drake (1954). 


TABLE 3 
CORRELATIONS BETWEEN GRADE 8 AND
GRADE 11 SCORES ON THE PRIMARY
MENTAL ABILITIES TEST


Girls Total 


-51 (.84)|.77 (.80)|.66 (.69) 
Reasoning (.81)).71 (.76)|.75 (.81) 
Number (.93)).63 (.71)|.73 (.82) 
Word Fluency .52 (.72)).27 (.38) .43 (.59) 


Verbal Mean- | 88 (.96)|.74 (.80) 
ing 
Space 


81 (.88) 


Note.—Entries in parentheses are coefficients cor- 
rected for attenuation. 


and W is consistent with the findings 
of Doppelt and Bennett (1951) described
above and would seem to support
the notion that the academic
abilities are more stable than the nonacademic.
In contrast with previous
findings, our data suggest the existence
of sex differences in consistency. Statistical
tests support this observation
for V, S, and N (z = 2.08, 2.20, 2.17,
respectively), but not for W and R 
(z = 1.45 and .52, respectively). These 
data can be interpreted as meaning 
that over the Grade Range 8-11 the 
relative position of boys as contrasted 
with girls is generally more stable. 
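
The text reports z values without naming the test. One standard choice for comparing correlations from two independent groups is Fisher's r-to-z transformation, sketched below as an assumption rather than as the authors' documented procedure. Applied to the Verbal Meaning retest correlations in Table 3 (.88 for the 49 boys, .74 for the 51 girls), it gives a value near 2.06, close to the reported 2.08; the small gap is plausibly due to rounding of the tabled correlations.

```python
from math import atanh, sqrt


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations.

    Each r is transformed with atanh (Fisher's z); the standard error of the
    difference is sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    """
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / standard_error


# Verbal Meaning: boys r = .88 (N = 49), girls r = .74 (N = 51).
z_verbal = fisher_z_difference(0.88, 49, 0.74, 51)
```

A value beyond 1.96 would be significant at the .05 level on a two-tailed test.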

Another problem of particular im- 
portance when dealing with specific 
traits as on the PMA is that of profile 
stability. Guidance counselors in par- 
ticular are concerned with the long 
term stability of difference scores since 
they are often used to determine a 
person’s future program of study. This 
problem has also been examined by 
Doppelt and Bennett (1951), for the 
DAT, who report a median correlation 
of .50 for all possible combinations of 
differences between test scores with a 
range from .20 (Numerical Ability 
minus Space Relations) to .74 (Me- 
chanical Reasoning minus Spelling). 
Little in the way of sex differences was 
noted. 
TABLE 4
RETEST CORRELATIONS BETWEEN PRIMARY
MENTAL ABILITIES SUBTEST DIFFERENCE
SCORES AT GRADES 8 AND 11


PMA Subtests    Boys    Girls    Total
V - S           .30     -.12     .14
V - R           .65      .19     .42
V - N           .68      .10     .39
V - W           .56     -.02
S - R           .53      .07
S - N           .66     -.10     .31
S - W           .44      .06     .33
R - N           .76     -.19     .25
R - W           .50     -.03     .29
N - W           .36     -.02     .15

The correlations presented in Table 4 
were computed by taking all the pos- 
sible nonredundant differences between 
subtests at each grade level and com- 
puting product-moment r’s for cor- 
responding pairs. The median r for the 
total group is .35 with a range from .14 
(V minus S) to .42 (V minus R). These
data are certainly not encouraging sup- 
port for the use of profile differences. 
Further examination of Table 4 shows 
a higher median r for boys (.55) than 
for girls (.08) which was to be expected 
in the light of our previous findings 
concerning consistency of performance. 
A tentative explanation for these sex 
differences is that the academic pro- 
gram pursued by the boys is more in 
conformity with the abilities measured 
on the PMA at both grade levels 
whereas for the girls the commercial 
courses have the effect of changing 
their relative strengths and weaknesses 
in Grade 11 in contrast to what they 
were in Grade 8. 
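
The procedure behind Table 4 (all nonredundant subtest differences at each grade, then a product-moment r between corresponding differences) can be sketched as below. The scores are simulated purely for illustration and are not the study's data; the variable and function names are likewise hypothetical.

```python
import random
from itertools import combinations
from math import sqrt
from statistics import mean, median

SUBTESTS = ["V", "S", "R", "N", "W"]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def difference_score_stability(grade8, grade11):
    """Retest r for every nonredundant subtest difference.

    Both arguments map a subtest label to a list of scores, with pupils in
    the same order at the two testings.
    """
    result = {}
    for a, b in combinations(SUBTESTS, 2):
        d8 = [x - y for x, y in zip(grade8[a], grade8[b])]
        d11 = [x - y for x, y in zip(grade11[a], grade11[b])]
        result[f"{a}-{b}"] = pearson_r(d8, d11)
    return result


# Simulated scores for 100 pupils (hypothetical, for illustration only).
rng = random.Random(0)
grade8 = {s: [rng.gauss(50, 10) for _ in range(100)] for s in SUBTESTS}
grade11 = {s: [x + rng.gauss(5, 6) for x in grade8[s]] for s in SUBTESTS}

stability = difference_score_stability(grade8, grade11)
median_stability = median(stability.values())
```

With five subtests there are ten such differences, which matches the ten rows of Table 4.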

The generally lower correlation of 
difference scores noted for the PMA in 
contrast with the DAT requires fur- 
ther examination. The most likely 
explanation for these divergent results 
lies in the different statistical tech- 
niques used for analysis. Doppelt and 
Bennett (1951) computed their measures
of relationship by means of a formula3
which assumes equal variability among
the variables, an assumption for which
they do not offer
supporting evidence. Judging from the 
data presented in the manual for the 
DAT, however, it would appear that 
the variance in Grade 12 would have 
been considerably larger than in Grade 
9. If the assumption were seriously
violated, however, their approach would
yield an overestimate of the degree of
relationship. Since our data did not fulfill
the assumption of homogeneity of
variance, Pearson product-moment correlations
were used. The median
magnitude of relationship for the total 
group, using the Doppelt-Bennett ap- 
proach, would have been .55, ranging 
from .46 (R minus W) to .68 (S minus 
N), which represents a substantial 
improvement over the results derived 
from product-moment r’s. 


Consistency of Factorial Structure 


In the analysis presented in this 
section our primary concern is with the 
degree to which the factor loadings on 
each of the PMA subtests change over 
time, i.e., factor stability. This prob- 
lem is of importance not only for the 
theoretical issues involved but also in 
terms of more practical considerations. 
To our knowledge empirical data have 
not previously been reported on this 
issue for the PMA. 

The scores of the 92 subjects who
were present for both administrations
of the PMA and the MRT were intercorrelated
for boys and girls separately
and for the total group. Each of the
11 × 11 matrices was factor analyzed
by the complete centroid method with 
where (1-2) and (I-II) represent the difference
obtained at first and second testings,
TABLE 5


FIRST- AND SECOND-ORDER REFERENCE VECTOR MATRICES AND INTERCORRELATIONS
AMONG FIRST- AND SECOND-ORDER PRIMARY MENTAL ABILITIES FACTORS


Boys Girls
(N = 46) (N = 46)


Correlations among factors


Second-order vector loadings


Factor intercorrelation


six orthogonal factors being extracted.4
The fifth centroid factor showed, in 
each matrix, one or more loadings 
greater than .30, while none of the 
loadings on the sixth factor reached 
.20. Consequently, only the first five 
factors were retained in the subsequent 
rotation. The centroid factor loadings 
were rotated to oblique simple struc- 
ture using the oblimax analytic rota- 
tion criterion developed by Pinzka and 


4 The authors are indebted to Gary Lotto 
and William B. Kehl for providing the fa- 
cilities of the University of Pittsburgh 
Computation and Data Processing Center 
for part of the statistical analyses. A table 
giving the unrotated centroid factor load- 
ings and transformation matrices has been 
deposited with the American Documenta- 
tion Institute. Order Document No. 6544 
from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress,
Washington 25, D. C., remitting in
advance $1.25 for microfilm or $1.25 for
photocopies. Make checks payable to: Chief,
Photoduplication Service, Library of Congress.


Saunders (1954) and the correlations 
(direction cosines) among the oblique 
reference vectors were converted into 
factor intercorrelations by the pro- 
cedure detailed by Cattell (1952, pp. 
224-232). Two centroid second-order 
factors were extracted from the factor 
correlation matrices and rotated to 
oblique simple structure using the 
same procedures. 
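
The conversion from reference vector cosines to factor intercorrelations can be illustrated with the standard matrix relation: invert the matrix of cosines and rescale the inverse so that its diagonal is unity. The sketch below assumes this relation corresponds in substance to the cited procedure; the three-factor matrix is hypothetical and the helper names are illustrative.

```python
from math import sqrt


def invert(matrix):
    """Gauss-Jordan inverse of a small, well-conditioned square matrix."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [float(i == j) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def factor_intercorrelations(reference_vector_cosines):
    """Rescale the inverse of the cosine matrix to a unit diagonal."""
    inv = invert(reference_vector_cosines)
    n = len(inv)
    d = [1.0 / sqrt(inv[i][i]) for i in range(n)]
    return [[d[i] * inv[i][j] * d[j] for j in range(n)] for i in range(n)]


# Hypothetical cosines among three oblique reference vectors.
C = [[1.0, -0.4, -0.2], [-0.4, 1.0, -0.3], [-0.2, -0.3, 1.0]]
PHI = factor_intercorrelations(C)
```

Note the sign reversal this produces: negatively correlated reference vectors correspond to positively correlated primary factors, as in the two-factor case where a cosine of -.50 yields a factor correlation of .50.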

Table 5 gives the oblique vector 
loadings of each test, the correlations 
among the primary factors, the vector 
loadings of the five primary factors on 
two second-order factors for each sex 
group and also for the combined groups.
The primary factors within each analysis
are clearly Thurstone’s V, S, R, N,
and W factors, but the factor intercorrelations
are quite large in several cases,
indicating that the PMA factors can- 
not be considered as orthogonal or 
independent. The presence of a single 
factor at the second-order level might 
be considered as evidence of a general g
factor unifying the five primary factors, 
but it appears that two factors must be 
postulated at the second-order level to 
account adequately for the factor inter- 
correlations at the first-order level. The 
large correlations between the second- 
order factors suggest a general g fac- 
tor is present but at the third-order 
rather than at the second-order level. 

Certain interesting sex differences in 
the factor structure of the PMA scales 
are apparent in the separate sex group 
analysis that are obscured when the 
groups are combined. The V scale ap- 
pears to be less well defined at the 
eleventh grade level for the girls (load- 
ing of .36) although its loading is quite 
high at the eighth grade level (.69) and 
at both grade levels for the boys (.54 
and .62). The V scale also has a considerable
loading on the N factor (.37)
at the eleventh grade level for girls although
none of the other scales similarly
load on two primary factors in
either matrix. Second-order Factor A 
has loadings on both the S and W pri- 
mary factors and Factor B on the V, R, 
and N primary factors for both sex 
groups. However, Factor V divides its 
variance between Factors A and B for 
the boys while V is loaded only on B 
for the girls. In spite of these apparent 
sex differences the intercorrelations 
between Factors A and B are quite 
similar in both sex groups (.66 and .65), 
but this correlation is much larger when 
all subjects are combined into the total 
group (.87). 

Considering first the results for the 
total group it seems quite clear that a 
high degree of consistency among the 
PMA factors prevails from Grade 8 
through Grade 11. It can be concluded 
from these data that the structure of 
intelligence changes very little over the 
time period included in this study. The 
large correlation between the factors 
suggests that the structure of intelli- 
gence is not composed of independent 
traits but rather that these traits share
a common source of variation. The 
foregoing interpretation is consistent 
with Vernon (1950) who proposes a 
hierarchical structure of intelligence 
with Spearman’s (1927) g as the major 
integrating factor. 

Considering next the data for the 
boys and girls, it can be concluded that 
factorial stability is greater for boys 
than for girls. That factorial structure 
changes for the girls as reflected in the 
lower factor loading on V in the elev- 
enth grade as compared with its load- 
ing in the eighth grade is not surprising 
in view of the previously reported 
findings. A tentative explanation of 
these findings will be discussed in a 
later section of this paper. 


Developmental Changes in Magnitude 
of Sex Differences and Rates of 
Growth of Each of the Primary 
Abilities 

Herzberg and Lepkin (1954) report 
comparisons between boys and girls at 
each of three age levels (16, 17, 18) on 
the subtests of the PMA. At each age 
level the S subtest was significantly 
higher for the boys. Scores on the sub- 
tests V, R, and W were significantly 
higher for the girls at age 17 but not at 
age 16, and at age 18 only W was 
reliably higher for the girls. In evaluat- 
ing these data it should be noted that 
all the subjects were high school sen- 
iors so that the three age levels actually 
represent different intellectual levels, 
thus confounding any conclusions con- 
cerning sex differences at various levels 
of development. The present data are 
not hampered by this limitation. 

Sex differences on each of the sub- 
tests were determined for each grade 
level by means of t tests. As shown in
Table 6, none of the differences were 
statistically significant in Grade 8 but 
in Grade 11 V, R, N, and W emerge as 
being significantly different in favor of 
the girls. Though scores on the space 
TABLE 6
SEX DIFFERENCES ON PRIMARY MENTAL
ABILITIES SUBTESTS AT GRADES
8 AND 11


Grade 8 Means Grade 11 Means 


Boys | Girls

16.83 | 17.89

15.19 | 17.73

29.28 | 33.00


factor favor the boys at both grades, 
the differences are not statistically 
significant. To the degree that our 
Grade 11 subjects are comparable to 
the 16-year-olds of Herzberg and Lepkin,
the correspondence of findings
would permit the conclusion that elev- 
enth grade girls achieve higher scores 
on V, R, N, and W than boys. 


Long Term Prediction of Achievement 


This section is concerned with the 
effectiveness of each of the subtests to 
predict achievement 3.5 years later. 
The correlations presented in Table 7 
are based on Grade 8 subtest scores and 
Grade 11 achievement as well as Grade 
11 subtest scores and Grade 11 achieve- 
ment. It will be recalled that the meas- 
ure of achievement is performance on 
the MRT. The magnitude of these r’s 
is fairly impressive, particularly the V 
subtest for boys. Generally these data 
are in agreement with other studies 
(Shinn, 1956; Thurstone, 1958; Well- 
man, 1957), particularly with respect 
to the overall superiority of V in pre- 
dicting achievement. The sex differ- 
ences in predictive power agree with 
Shinn (1956) who found small but 
consistent differences in the magnitude 
of r in favor of boys. That our differ- 
ences are larger is probably attribut- 
able to the greater proportion of our 
girls being enrolled in the commercial 


TABLE 7
CORRELATIONS OF PRIMARY MENTAL ABILITIES
SUBTESTS ADMINISTERED AT GRADES
8 AND 11 WITH SCHOOL
ACHIEVEMENT TEST ADMINISTERED
AT GRADE 11


Subtest    Boys (N = 46)       Girls (N = 46)      Total (N = 92)
           Grade 8  Grade 11   Grade 8  Grade 11   Grade 8  Grade 11


.58 
-30 
-36 

31 
14 


| 


program wherein preparation for a test 
such as the MRT is less effective than 
a more academically oriented course of 
study. 

Since it is ordinarily expected that 
concurrent validity is higher than long 
term validity, the lower correlations 
between eleventh grade achievement 
and eleventh grade PMA scores than 
between eighth grade PMA scores and 
eleventh grade achievement require
some consideration. After discarding 
several psychological hypotheses as 
being illogical or untestable, the writers 
next examined the data for possible 
statistical explanations. One possibil- 
ity considered was the difference in 
variances from Grade 8 to Grade 11 on 
the PMA subtests but, it will be re- 
called, the Grade 11 variances are 
greater which should have led to larger 
correlations. Still attending to the 
difference in variances, a testable sta- 
tistical hypothesis was developed. 
Considering the definition of r as the 
ratio of the covariance to the product 
of the standard deviations of the two 
variables5 it can be shown that if the


5 This formula symbolically is $r_{xy} = \mathrm{cov}(x, y) / (\sigma_x \sigma_y)$,
from which it can be shown that the ratio
of any pair of correlations is proportional
to the ratio of the standard deviations.
covariance remains constant and the σ
of one of the two variables remains
constant then any increase in the size
of the second σ must reduce the magnitude
of the resulting correlation coefficient.
In the present problem the σ
remains constant for the achievement 
test variable but not for the PMA sub- 
test variable which is larger for the 
eleventh grade than the eighth. The 
crucial question remaining is whether 
or not the covariation between Grade 8 
subtest scores and Grade 11 achieve- 
ment scores is similar to the covariation 
between Grade 11 PMA scores and 
Grade 11 achievement scores. Direct 
computation of the essential covari- 
ances indicated that in every case they 
were equal or approximately equal. 
These latter findings then permit the 
conclusion that the observed differ- 
ences in predictive validity are merely 
statistical artifacts. Nevertheless it 
should be noted that the greater vari- 
ability in the eleventh grade PMA 
scores does not improve their predic- 
tive power. We believe this occurs 
because the harder items at the older 
age level are less appropriate discrimi- 
nators for a test such as the MRT. Cer- 
tainly further research is necessary, 
using a variety of achievement meas- 
ures, before these assertions will have 
any generality. 
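
The statistical argument above can be checked with a small numerical sketch. The covariance and standard deviations below are hypothetical values chosen only to make the arithmetic visible; they are not taken from the study.

```python
def correlation_from_moments(cov_xy, sd_x, sd_y):
    """r as the covariance divided by the product of the standard deviations."""
    return cov_xy / (sd_x * sd_y)


# Hypothetical values: the covariance with achievement and the achievement
# SD are held constant, while the PMA subtest SD rises from Grade 8 to 11.
COVARIANCE = 30.0
SD_ACHIEVEMENT = 10.0
SD_PMA_GRADE8 = 5.0
SD_PMA_GRADE11 = 6.0

r_grade8 = correlation_from_moments(COVARIANCE, SD_PMA_GRADE8, SD_ACHIEVEMENT)
r_grade11 = correlation_from_moments(COVARIANCE, SD_PMA_GRADE11, SD_ACHIEVEMENT)

# With everything else fixed, the two correlations stand in the inverse
# ratio of the two PMA standard deviations.
ratio_of_correlations = r_grade8 / r_grade11
ratio_of_sds = SD_PMA_GRADE11 / SD_PMA_GRADE8
```

In this illustration the larger Grade 11 standard deviation alone lowers the correlation, with no change in the underlying covariation.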


DISCUSSION


Of all the conclusions warranted by 
this study the most surprising are those 
involving sex differences. The fact that 
scores on the subtests V, R, and W 
were significantly better for the girls is 
consistent with other data, but that 
these girls perform reliably better than 
the boys on N is clearly inconsistent 
with other studies (Herzberg & Lepkin, 
1954). In addition the consistently 
higher correlations found for the boys 
in contrast to the girls are unique in the
literature. This latter finding is prob- 


WILLIAM J. MEYER AND A. W. BENDIG 


ably best explained statistically in that 
the performance of the boys is more 
variable in Grade 11 than that of the 
girls and this, combined with the higher 
mean performance of the eleventh 
grade girls, suggests the presence of a 
ceiling effect for the girls. As a basis for 
a tentative explanation of the observed 
sex differences, we postulate that the 
training received by the girls in the 
commercial program leads to changes 
in the organization of intellectual 
abilities which do not result from the 
academic program of study followed by 
the boys. Recall that for the eleventh 
grade girls the V factor splits its vari- 
ance with the N factor indicating that 
these factors overlap. We postulate 
that this overlap represents the emer- 
gence of a clerical ability which reflects 
an emphasis on training for accuracy 
and speed that is not necessarily 
stressed to the same degree in the 
academic program. But it is exactly 
these abilities which are important to 
good performance on the N subtest and 
to a lesser degree on the V subtest. If 
our analysis is correct, then the supe- 
rior performance of the girls on the N 
subtest reflects their greater training 
on this type of task. This explanation 
is consistent with the broad cultural 
interpretation usually given similar 
data. 

The writers recognize that the fore- 
going assertions are highly speculative 
but in the absence of any data of a 
similar nature we have attempted to 
put forth an idea that might be tested 
in a subsequent study. It may be that 
other interpretations will prove more 
fruitful but in any event with the large 
numbers of girls enrolled in commercial 
courses, it would seem important that 
some attempts be made to replicate 
our findings. 

Though the purpose of this study did 
not include an evaluation of the PMA, 
such a statement would appear warranted.
Certainly the test-retest correlations
for the Factors V, R, and N
are sufficiently high to permit cautious 
long term counseling. This is fortunate 
since those factors correlate most 
highly with achievement. The Factors 
S and W are quite unstable and appear 
to contribute little towards the predic- 
tion of future behavior. It should be 
noted here that despite the claims 
made, the V subtest is still the best 
predictor of achievement, is the most 
stable over time, and, importantly, 
correlates highest with the other sub- 
tests and with total score both concur- 
rently (.73) and long term (.64). The 
second best predictor of achievement is 
the T score (.47) which is also the most 
stable score (.82). Though long term 
consistency in relative position is satis- 
factory, the same cannot be said for 
the consistency of differences between 
subtests. Profiles as derived from the 
PMA would appear to have little value 
in differential prediction or for coun- 
seling purposes. On the basis of the 
several studies reporting sex differences 
on the PMA subtests it seems quite 
clear that separate norms for boys and 
girls are needed if the test is to be used 
intelligently. Finally we should like to 
call for further research on the reasons 
for observed sex differences and for 
factorial validation studies employing 
samples other than those pursuing 
academic programs in the high schools 
and colleges. 


SUMMARY 


A sample of 100 eighth graders was 
administered the Primary Mental 
Abilities Test, Intermediate Form, and 
was readministered the same test as 
they were concluding the eleventh 
grade. Several analyses of the resulting 
data were performed which showed 
that relative position on the PMA sub- 
tests is fairly well maintained over 
time but that differences between sub- 


test scores are quite unstable. Though 
no sex differences were noted at Grade 
8, there were significant differences in 
favor of the girls at Grade 11 on the V, 
R, N, and W subtests. In addition to 
these sex differences it was noted that 
girls were less consistent in perform- 
ance than the boys and their scores 
correlated less well with achievement, 
as measured by the Myers-Ruch High 
School Achievement Test. The Verbal 
Meaning subtest predicted achieve- 
ment most effectively, with the Total 
score ranking second. Analyses were 
also made relevant to the problem of 
the structure of intelligence and the 
degree to which specific factors emerge. 
The absence of evidence for increased 
differentiation of abilities as noted 
from Grade 8 to Grade 11 and the 
highly correlated second-order factors
were viewed as support for Vernon’s
hierarchical-structure theory of intelligence.